Tag Archives: elasticsearch

Garbage Collection in Elasticsearch and the G1GC

We have been using Elasticsearch as the storage and analysis tool for our centralized logging, in addition to various full-text search needs in different applications. The cluster used for centralized logs handles a heavy load, indexing nearly 1.5 billion documents a day and reaching about 50k documents per second during peak hours. The same cluster also serves the searches that drive alerts and provide data to Kibana dashboards. Hence the cluster performs a large number of aggregation queries in addition to document indexing. With heavy loads the cluster started having performance... Read More

Centralizing logs at Naukri.com with Kafka and ELK stack

Logs are an important part of any system, as they give deep insight into what is happening with the system. They also help in figuring out what went wrong when something unexpected happens. Most applications generate logs in one form or another, and they are generally written to files on the local disk. A web application consists of various components, and each of them generates logs. A few of them are listed below. Access logs from web servers like Nginx and Apache; logs from back-end applications (Java, PHP, Python, etc.); logs from databases, cache systems, queues, etc. System... Read More

Managing Relations with Elasticsearch

1. Introduction Elasticsearch is a distributed full-text search engine over schema-free JSON documents. It is well suited to indexing and searching through a large number of documents. It supports fast and powerful functionality such as term and range queries, full-text search, and aggregations on large data sets. It is built on Lucene, which is a high-performance, full-featured text search engine library written in Java. When we are indexing data, the world is rarely as simple as independent data existing in... Read More