Knowledge Discovery: Preprocessing Technique in Web Server Data log

Supriyadi Supriyadi

Abstract


Preprocessing prepares clean, ready-to-use data for data mining analysis. It consists of four stages: data selection, data cleaning, data integration, and data transformation. Web server logs accumulate in large volumes and in many semi-structured patterns, which makes it difficult to clean the data down to the records that are actually needed. This research therefore proposes a data cleaning technique that combines a parser algorithm with query processing. The parser algorithm is written in PHP, a web-based programming language, while the query processing is implemented in the MySQL relational database management system (RDBMS). The system was tested on twenty-six trial data sets of different sizes, originating from six web servers. The results show that, in general, the tested data was reduced by 80 percent on average, with an average processing speed of 9.28 mbps.

Keywords: query processing, parser algorithm, semi structured data, web log data
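
The two-step idea in the abstract (parse the semi-structured log lines, then clean them with queries) can be illustrated with a minimal sketch. It is not the authors' implementation: the paper uses PHP and MySQL, whereas this sketch uses Python with the standard-library `re` and `sqlite3` modules as stand-ins. The input format (Common Log Format), the table layout, the helper names `parse_lines` and `clean_log`, and the cleaning rules (drop non-GET requests, non-2xx responses, and static resources) are all assumptions made for illustration; the abstract does not state them.

```python
import re
import sqlite3

# Assumption: input lines follow the Common Log Format; the abstract does not
# specify the log format used by the six web servers.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

def parse_lines(lines):
    """Parser step: turn raw log lines into tuples, skipping malformed lines."""
    for line in lines:
        m = LOG_PATTERN.match(line)
        if m:
            yield (m["host"], m["time"], m["method"], m["url"], int(m["status"]))

def clean_log(lines):
    """Query step: load parsed rows into a table and delete unwanted records."""
    conn = sqlite3.connect(":memory:")  # sqlite3 stands in for MySQL here
    conn.execute(
        "CREATE TABLE log (host TEXT, time TEXT, method TEXT, url TEXT, status INTEGER)"
    )
    conn.executemany("INSERT INTO log VALUES (?, ?, ?, ?, ?)", parse_lines(lines))
    # Hypothetical cleaning rules, chosen only for this example.
    conn.execute(
        """
        DELETE FROM log
        WHERE method <> 'GET'
           OR status NOT BETWEEN 200 AND 299
           OR lower(url) LIKE '%.css'
           OR lower(url) LIKE '%.js'
           OR lower(url) LIKE '%.png'
           OR lower(url) LIKE '%.jpg'
           OR lower(url) LIKE '%.gif'
        """
    )
    rows = conn.execute("SELECT host, time, url FROM log ORDER BY rowid").fetchall()
    conn.close()
    return rows

# Made-up sample lines, not data from the paper.
SAMPLE = [
    '10.0.0.1 - - [10/Oct/2016:13:55:36 +0700] "GET /index.php HTTP/1.1" 200 2326',
    '10.0.0.1 - - [10/Oct/2016:13:55:37 +0700] "GET /img/logo.png HTTP/1.1" 200 512',
    '10.0.0.2 - - [10/Oct/2016:13:56:01 +0700] "GET /missing.html HTTP/1.1" 404 -',
    'not a log line',
]

if __name__ == "__main__":
    for row in clean_log(SAMPLE):
        print(row)
```

Keeping the parser and the cleaning query separate means the filtering rules can be changed in SQL without touching the parsing code, which is one plausible reason to pair a parser with an RDBMS.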



DOI: http://dx.doi.org/10.14203/j.inkom.574


INKOM - Jurnal Informatika, Sistem Kendali dan Komputer