Published in: Vol. 7 Issue 2
Date of Publication: December 2018

Silhouette Threshold Based Text Clustering for Log Analysis

Jayadeep J

Page(s): 71-79
ISSN: 2278-2397
DOI: 10.20894/IJDMTA.102.007.002.004
Publisher: Integrated Intelligent Research (IIR)

Automated log analysis has been a dominant area of interest for both industry and academia. The heterogeneous nature of system logs, the disparate sources of logs (infrastructure, networks, databases and applications) and their underlying structures and formats make the challenge harder. In this paper I present less frequently used document clustering techniques to dynamically organize real-time log events (e.g. errors, warnings) into specific categories that are pre-built from a corpus of log archives. This kind of syntactic log categorization can be exploited for automatic log monitoring, priority flagging and dynamic solution recommendation systems. I propose practical strategies to cluster and correlate high-volume log archives and high-velocity real-time log events, with attention to both solution quality and computational efficiency. First, I compare two traditional partitional document clustering approaches for categorizing a high-dimensional log corpus. To select a suitable model for this problem, Entropy, Purity and Silhouette Index are used to evaluate these learning approaches. Next, I propose computationally efficient approaches to generate a vector space model for the real-time log events. To dynamically relate these events to the categories from the corpus, I suggest combining a critical distance measure with a least distance approach. In addition, I introduce and evaluate three different critical distance measures to ascertain whether a real-time event belongs to a totally new category that is unobserved in the corpus.
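
The combination of a least distance approach with a critical distance, as described in the abstract, can be illustrated with a minimal sketch. It is not the paper's implementation: the term-frequency vectors, the cosine distance, the sample log lines, the category names and the fixed threshold of 0.5 are all assumptions chosen for illustration, and the three critical distance measures evaluated in the paper are not reproduced here.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Term-frequency vector over lowercase alphabetic tokens (assumed scheme)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * b[term] for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def centroid(vectors):
    """Mean vector of a category's archived log events."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return Counter({term: w / len(vectors) for term, w in total.items()})


def assign(event, centroids, critical_distance):
    """Pick the nearest category; return None as label if it is too far away."""
    vec = vectorize(event)
    label, dist = min(
        ((name, cosine_distance(vec, c)) for name, c in centroids.items()),
        key=lambda pair: pair[1],
    )
    if dist > critical_distance:
        return None, dist  # treated as a category unobserved in the corpus
    return label, dist


# Hypothetical log archive, already grouped into categories.
ARCHIVE = {
    "db_timeout": [
        "ERROR database connection timeout after 30s",
        "ERROR timeout waiting for database connection pool",
    ],
    "disk_full": [
        "WARN disk usage above threshold on volume data",
        "ERROR no space left on disk volume data",
    ],
}
CENTROIDS = {
    name: centroid([vectorize(line) for line in lines])
    for name, lines in ARCHIVE.items()
}
CRITICAL_DISTANCE = 0.5  # arbitrary placeholder, not a value from the paper

known = assign("ERROR database connection timeout on replica",
               CENTROIDS, CRITICAL_DISTANCE)
novel = assign("INFO user login succeeded for alice",
               CENTROIDS, CRITICAL_DISTANCE)
print(known[0], novel[0])
```

In this sketch the first event shares most of its terms with the `db_timeout` centroid and is assigned to it, while the second event is far from every centroid and is flagged as a candidate new category. In practice the threshold would be derived from the corpus rather than fixed by hand.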