Category Archives: Platform Development

Understanding Disk I/O – when should you be worried?

Monitoring and analyzing performance is an important task for any system administrator. Disk I/O bottlenecks can bring applications to a crawl. Some common questions for anyone embarking on a disk I/O analysis are: What are IOPS? Should I use SATA, SAS, or SSD? What RAID level should I use? Is my system read-heavy or write-heavy? Disclaimer: I do not consider myself an expert in storage, or in anything else for that matter. This is just how I have done I/O analysis in the past. I welcome additions and corrections. What are IOPS? They are input-output (I/O)... Read More

Understanding Linux CPU Load – when should you be worried?

Most readers will be familiar with Linux load averages. Load averages are the three numbers shown by the uptime and top commands. They look like this: load average: 0.09, 0.05, 0.01 Most administrators have a notion of what the load averages mean: the three numbers represent averages over progressively longer periods of time (one-, five-, and fifteen-minute averages), and lower numbers are better. Higher numbers represent a problem or an overloaded machine. But what's the threshold? What constitutes "good" and "bad"... Read More

Garbage Collection in Elasticsearch and the G1GC

We have been using Elasticsearch as the storage and analysis tool for our centralized logging, in addition to various full-text search needs in different applications. The cluster used for centralized logs handles a heavy load, indexing nearly 1.5 billion documents a day and reaching about 50k documents per second during peak hours. The same cluster also serves the searches that drive alerts and provide data to Kibana dashboards. Hence the cluster performs a large number of aggregation queries in addition to document indexing. With heavy loads the cluster started having performance... Read More

Centralizing logs at Naukri.com with Kafka and ELK stack

Logs are an important part of any system, as they give deep insight into what is happening with the system. They also help in figuring out what went wrong when something unexpected happens. Most applications generate logs in one form or another, and they are generally written to files on the local disk. A web application consists of various components, and each of them generates logs. A few of them are mentioned below: access logs from web servers like Nginx and Apache; logs from back-end applications (Java, PHP, Python, etc.); logs from databases, cache systems, queues, etc.; System... Read More