This page lists the operations that are useful for troubleshooting problems on an Elasticsearch cluster.
Cluster management
List cluster nodes
> curl -XGET "http://127.1:9200/_cat/nodes?v"
ip         heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
172.22.0.5            3          99  12    6.95    7.98     8.51 di        -      ingest_1
172.22.0.3           95          99  17    6.95    7.98     8.51 mdi       *      master_1
172.22.0.4           50          99  13    6.95    7.98     8.51 di        -      data_1
172.22.0.2            0          99  12    6.95    7.98     8.51 di        -      data_2
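On a larger cluster it can help to narrow this output down to the nodes under memory pressure. The following is a minimal sketch, assuming the default `?v` column layout shown above (heap.percent in the second column, node name in the last). The function name `high_heap_nodes` and the `ES_URL` variable are illustrative helpers, not part of Elasticsearch.

```shell
# Base URL of the cluster (assumption: same address as used on this page).
ES_URL="${ES_URL:-http://127.1:9200}"

# Print "name heap.percent" for every node whose heap usage is at or above
# the threshold given as $1. Reads `_cat/nodes?v` output on stdin and skips
# the header line.
high_heap_nodes() {
  awk -v min="$1" 'NR > 1 && $2 + 0 >= min + 0 { print $NF, $2 }'
}

# Usage (requires a reachable cluster):
#   curl -s -XGET "$ES_URL/_cat/nodes?v" | high_heap_nodes 75
```

If the column order differs on your version, requesting explicit columns with the `h` parameter is a safer basis for parsing.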
Find out data directories for a node
> curl -XGET "http://127.1:9200/_nodes/data_2/stats/fs?pretty" | grep path
"path" : "/opt/es/data/nodes/1",
> curl -XGET "http://127.1:9200/_nodes/data_1/stats/fs?pretty" | grep path
"path" : "/opt/es/data/nodes/2",
> curl -XGET "http://127.1:9200/_nodes/ingest_1/stats/fs?pretty" | grep path
"path" : "/opt/es/data/nodes/3",
> curl -XGET "http://127.1:9200/_nodes/master_1/stats/fs?pretty" | grep path
"path" : "/opt/es/data/nodes/0",
Dump current cluster settings
> curl -XGET "http://127.1:9200/_cluster/settings?pretty"
{
  "persistent" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "cluster_concurrent_rebalance" : "32",
          "node_concurrent_recoveries" : "64",
          "exclude" : {
            "_ip" : "172.22.0.2, 172.22.0.3"
          },
          "node_initial_primaries_recoveries" : "32",
          "enable" : "all"
        }
      }
    }
  },
  "transient" : { }
}
List all cluster pending tasks
The pending cluster tasks API returns a list of any cluster-level changes (e.g. create index, update mapping, allocate or fail shard) which have not yet been executed.
https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-pending-tasks.html
> curl -XGET "http://127.1:9200/_cat/pending_tasks?pretty"
Show global cluster health
> curl -XGET "http://127.1:9200/_cluster/health?pretty"
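During maintenance it is often useful to wait until the cluster is healthy again before continuing. The sketch below polls the health endpoint used above and extracts the `status` field with plain `grep` and `cut`. The names `health_status`, `wait_for_green` and `ES_URL`, as well as the 10 second polling interval, are assumptions made for illustration.

```shell
# Base URL of the cluster (assumption: same address as used on this page).
ES_URL="${ES_URL:-http://127.1:9200}"

# Extract the value of the "status" field (green, yellow or red) from the
# JSON returned by `_cluster/health`, read on stdin.
health_status() {
  grep -o '"status" *: *"[a-z]*"' | cut -d'"' -f4
}

# Poll until the cluster reports green, or give up after $1 attempts
# (default 30), sleeping 10 seconds between attempts.
wait_for_green() {
  attempts="${1:-30}"
  while [ "$attempts" -gt 0 ]; do
    status="$(curl -s -XGET "$ES_URL/_cluster/health" | health_status)"
    [ "$status" = "green" ] && return 0
    attempts=$((attempts - 1))
    sleep 10
  done
  return 1
}

# Usage (requires a reachable cluster):
#   wait_for_green 60 || echo "cluster is still not green"
```

The health API also accepts `wait_for_status` and `timeout` query parameters, which make the server block until the status is reached; the client-side loop is only one possible approach.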
List per host threads
The thread_pool command shows cluster wide thread pool statistics per node. By default the active, queue and rejected statistics are returned for all thread pools.
https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-thread-pool.html
> curl -XGET "http://127.1:9200/_cat/thread_pool/generic?v&h=id,name,active,rejected,completed"
> curl -XGET "http://127.1:9200/_cat/thread_pool/generic?v"
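Rejected requests are usually the first thing to look for in thread pool statistics. As a hedged sketch, the helper below keeps only the pools with a non-zero `rejected` counter. It assumes the columns are requested in the order `node_name,name,active,queue,rejected`; `rejecting_pools` and `ES_URL` are hypothetical names.

```shell
# Base URL of the cluster (assumption: same address as used on this page).
ES_URL="${ES_URL:-http://127.1:9200}"

# Reads lines of "node_name name active queue rejected" on stdin and prints
# only the rows whose rejected counter (5th column) is greater than zero.
# A header line, if present, is skipped because its 5th field is not numeric.
rejecting_pools() {
  awk '$5 + 0 > 0 { print $1, $2, "rejected=" $5 }'
}

# Usage (requires a reachable cluster):
#   curl -s -XGET "$ES_URL/_cat/thread_pool?h=node_name,name,active,queue,rejected" | rejecting_pools
```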
Show disk usage for all hosts
> curl -XGET "http://localhost:9200/_cat/allocation?v&pretty"
Change minimum master nodes
https://www.elastic.co/guide/en/elasticsearch/reference/6.2/modules-discovery-zen.html
> curl -H 'Content-Type: application/json' -XPUT \
    "http://127.1:9200/_cluster/settings?master_timeout=30m" \
    -d '{ "persistent" : { "discovery.zen.minimum_master_nodes" : 18 } }'
Change search default timeout
To limit the impact of long-running search requests, a default timeout can be set for all queries.
https://www.elastic.co/guide/en/elasticsearch/reference/6.0/search.html#global-search-timeout
> curl \
    -H 'Content-Type: application/json' \
    -XPUT "http://localhost:9200/_cluster/settings?master_timeout=30m" \
    -d '{ "persistent": { "search.default_search_timeout" : "300s" } }'
Nodes management
Use this API to review and change cluster-wide settings.
https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-update-settings.html
Delaying allocation when a node leaves
When a node leaves the cluster for whatever reason, intentional or otherwise, the master reacts by:
- Promoting a replica shard to primary to replace any primaries that were on the node.
- Allocating replica shards to replace the missing replicas (assuming there are enough nodes).
- Rebalancing shards evenly across the remaining nodes.
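These actions are intended to protect the cluster against data loss by ensuring that every shard is fully replicated as soon as possible. When the node is only briefly unavailable (for example during a restart), this shuffling is unnecessary and costly, so the allocation of replicas that became unassigned when a node left can be delayed.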
https://www.elastic.co/guide/en/elasticsearch/reference/current/delayed-allocation.html
> curl -H 'Content-Type: application/json' -XPUT "http://127.1:9200/_all/_settings?master_timeout=30m" \
    -d '{ "settings" : { "index.unassigned.node_left.delayed_timeout": "15m" } }'
Exclude IP or HOST from cluster
You can use cluster-level shard allocation filters to control where Elasticsearch allocates shards from any index. These cluster wide filters are applied in conjunction with per-index allocation filtering and allocation awareness.
https://www.elastic.co/guide/en/elasticsearch/reference/current/allocation-filtering.html
> curl -H 'Content-Type: application/json' -XPUT "http://127.1:9200/_cluster/settings?master_timeout=30m" \
    -d '{ "transient" : { "cluster.routing.allocation.exclude._ip": "1.1.1.1" } }'
or
> curl -H 'Content-Type: application/json' -XPUT "http://127.1:9200/_cluster/settings?master_timeout=30m" \
    -d '{ "transient" : { "cluster.routing.allocation.exclude._host": "data_1" } }'
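After excluding a node, its shards are moved away gradually, so it is worth checking that none remain before stopping it. The following sketch counts the shards still located on a given node from `_cat/shards?h=node` output. The helper name `shards_on_node` and the `ES_URL` variable are assumptions for illustration.

```shell
# Base URL of the cluster (assumption: same address as used on this page).
ES_URL="${ES_URL:-http://127.1:9200}"

# Count the shards still located on node $1. Reads `_cat/shards?h=node`
# output on stdin (one node per line). Only the first field is compared,
# so a relocating shard is counted against its source node.
shards_on_node() {
  awk -v node="$1" '$1 == node { n++ } END { print n + 0 }'
}

# Usage (requires a reachable cluster):
#   curl -s -XGET "$ES_URL/_cat/shards?h=node" | shards_on_node data_1
```

A result of `0` suggests the node no longer holds any shard and can be taken out of the cluster.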
List hot threads
This API yields a breakdown of the hot threads on each selected node in the cluster.
https://www.elastic.co/guide/en/elasticsearch/reference/6.6/cluster-nodes-hot-threads.html
> curl -X GET "http://127.1:9200/_nodes/hot_threads"
> curl -X GET "http://127.1:9200/_nodes/node_ID/hot_threads"
Shards management
Use this API to review and change cluster-wide settings.
https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-update-settings.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/shards-allocation.html
List all shards
> curl -XGET "http://127.1:9200/_cat/shards?v"
List shard allocations per node
> curl -XGET "http://127.1:9200/_cat/allocation?v"
shards disk.indices disk.used disk.avail disk.total disk.percent host       ip         node
  1372       80.8gb     1.7tb      2.4tb      4.1tb           40 172.22.0.5 172.22.0.5 ingest_1
  1380       45.8gb     1.7tb      2.4tb      4.1tb           40 172.22.0.4 172.22.0.4 data_1
     5      133.9gb     1.7tb      2.4tb      4.1tb           40 172.22.0.3 172.22.0.3 master_1
   161        8.2mb     1.7tb      2.4tb      4.1tb           40 172.22.0.2 172.22.0.2 data_2
Explain shard allocation decisions
The purpose of the cluster allocation explain API is to provide explanations for shard allocations in the cluster. For unassigned shards, the explain API provides an explanation for why the shard is unassigned. For assigned shards, the explain API provides an explanation for why the shard is remaining on its current node and has not moved or rebalanced to another node.
https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-allocation-explain.html
> curl -XGET "http://127.1:9200/_cluster/allocation/explain?pretty"
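Called without a body, the explain API picks a shard on its own. To ask about one specific shard, a request body with `index`, `shard` and `primary` can be sent. The sketch below builds that body; the function names `explain_body` and `explain_shard` and the `ES_URL` variable are illustrative only.

```shell
# Base URL of the cluster (assumption: same address as used on this page).
ES_URL="${ES_URL:-http://127.1:9200}"

# Build the request body for the allocation explain API.
#   $1: index name
#   $2: shard number
#   $3: true for the primary shard, false for a replica
explain_body() {
  printf '{ "index": "%s", "shard": %d, "primary": %s }' "$1" "$2" "$3"
}

# Send the explain request for a single shard.
explain_shard() {
  curl -s -H 'Content-Type: application/json' \
    -XGET "$ES_URL/_cluster/allocation/explain?pretty" \
    -d "$(explain_body "$1" "$2" "$3")"
}

# Usage (requires a reachable cluster):
#   explain_shard .monitoring-es-6-2019.05.15 0 true
```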
Deactivate shard reallocation
> curl -H 'Content-Type: application/json' -XPUT "http://127.1:9200/_cluster/settings?master_timeout=30m" \
    -d '{ "transient" : { "cluster.routing.allocation.enable" : "none" } }'
Reset shard reallocation to its previous state (removes the transient override, so the persistent or default value applies again)
> curl -H 'Content-Type: application/json' -XPUT "http://127.1:9200/_cluster/settings?master_timeout=30m" \
    -d '{ "transient" : { "cluster.routing.allocation.enable" : null } }'
Reactivate shard reallocation
> curl -H 'Content-Type: application/json' -XPUT "http://127.1:9200/_cluster/settings?master_timeout=30m" \
    -d '{ "transient" : { "cluster.routing.allocation.enable" : "all" } }'
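The three commands above differ only in the value of `cluster.routing.allocation.enable`, so they can be wrapped in one helper that rejects typos before anything is sent. This is a sketch under assumptions: `allocation_body`, `set_allocation`, the `reset` keyword and `ES_URL` are hypothetical names, while `all`, `primaries`, `new_primaries` and `none` are the values the setting accepts.

```shell
# Base URL of the cluster (assumption: same address as used on this page).
ES_URL="${ES_URL:-http://127.1:9200}"

# Build the transient settings body for cluster.routing.allocation.enable.
# Accepts all, primaries, new_primaries, none, or "reset" (which sends null
# and removes the transient override). Returns non-zero on any other value.
allocation_body() {
  case "$1" in
    all|primaries|new_primaries|none) value="\"$1\"" ;;
    reset) value="null" ;;
    *) echo "unsupported value: $1" >&2; return 1 ;;
  esac
  printf '{ "transient": { "cluster.routing.allocation.enable": %s } }' "$value"
}

# Apply the setting on the cluster; nothing is sent if the value is invalid.
set_allocation() {
  body="$(allocation_body "$1")" || return 1
  curl -s -H 'Content-Type: application/json' \
    -XPUT "$ES_URL/_cluster/settings?master_timeout=30m" -d "$body"
}

# Usage during a node restart (requires a reachable cluster):
#   set_allocation none
#   ... restart the node ...
#   set_allocation all
```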
Change the number of concurrent shard recoveries and of initial primary recoveries allowed per node
> curl -H 'Content-Type: application/json' -XPUT "http://127.1:9200/_cluster/settings?master_timeout=30m" \
    -d '{ "transient" : { "cluster.routing.allocation.node_concurrent_recoveries" : 36 } }'
and
> curl -H 'Content-Type: application/json' -XPUT "http://127.1:9200/_cluster/settings?master_timeout=30m" \
    -d '{ "transient" : { "cluster.routing.allocation.node_initial_primaries_recoveries" : 36 } }'
Force reallocation of failed shards
> curl -H 'Content-Type: application/json' -XPOST "http://127.1:9200/_cluster/reroute?retry_failed=true"
Indices management
List indices
The indices command provides a cross-section of each index. This information spans nodes.
https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-indices.html
> curl -X GET "http://127.1:9200/_cat/indices/.kib*?v&s=index"
List all **red** indices
> curl -XGET "http://127.1:9200/_cat/indices?v&health=red"
health status index                           uuid                   pri rep docs.count docs.deleted store.size pri.store.size
red    open   .monitoring-es-6-2019.05.15     TCJzdsLnRaqMn8YRWwnMIw   1   0
red    open   .monitoring-kibana-6-2019.05.21 FqxqFwFcSj2vYed_W_26Rg   1   0
red    open   .monitoring-es-6-2019.05.21     0ASH80vrQ7i_9mw19_QvhA   1   0
red    open   .monitoring-kibana-6-2019.05.16 DUydDdHSRA2hATqK3-suYw   1   0
red    open   .monitoring-kibana-6-2019.05.19 HFS8AY09QleNsubqOdZ-gA   1   0
red    open   .monitoring-kibana-6-2019.05.18 PcL_P-KDR8OQSVMw054XSg   1   0
red    open   .monitoring-kibana-6-2019.05.20 3OqAwzKgSXyVdDbQGA_qiA   1   0
red    open   .monitoring-es-6-2019.05.19     OFMac-OQRWKYGqd9jeOhJA   1   0
red    open   .monitoring-es-6-2019.05.18     fh2_G2tATxeboUDDjx0S8g   1   0
red    open   .monitoring-kibana-6-2019.05.15 MggMgNQXSmmB2DbRVe-tWA   1   0
red    open   .monitoring-es-6-2019.05.17     mFH9W76OSbOUhDqSSg3Bpg   1   0
red    open   .monitoring-es-6-2019.05.16     nrhk7lTBRa-CrBDtP3py7Q   1   0
red    open   .monitoring-kibana-6-2019.05.17 PRJ-48E8RuyqiOxJuS8H5g   1   0
red    open   .monitoring-es-6-2019.05.20     hpA6vUF8Smy5fOa75NbhGQ   1   0
Delete indices
> curl -XDELETE "http://127.1:9200/.kibana_1"
Open indices
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-open-close.html
> curl -s -XPOST "http://127.1:9200/myindice/_open"
# Multiple indices can be opened at once
> curl -s -XPOST "http://127.1:9200/myfirstindice,mysecondindice,mythirdindice/_open"
Close indices
> curl -s -XPOST "http://127.1:9200/myindice/_close"
# Multiple indices can be closed at once
> curl -s -XPOST "http://127.1:9200/myfirstindice,mysecondindice,mythirdindice/_close"
Tasks management
List tasks
Combined with the thread statistics from the "List per host threads" section under Cluster management, this shows which tasks are creating overhead on the cluster.
> curl -XGET 'localhost:9200/_tasks?pretty'
"tasks" : {
  "2R6ZGJ7JSfOG4CAMaAjBTw:45767382" : {
    "node" : "2R6ZGJ7JSfOG4CAMaAjBTw",
    "id" : 45767382,
    "type" : "netty",
    "action" : "indices:data/read/search[phase/query]",
    "start_time_in_millis" : 1576682076364,
    "running_time_in_nanos" : 8576312675,
    "cancellable" : true,
    "parent_task_id" : "_Gy3HAVDT0ytQrN9D_Eh3A:43754248",
    "headers" : { }
  }
Cancel tasks
# Replace parent_task_id with the ID of the task to cancel, for example:
> curl -XPOST "localhost:9200/_tasks/parent_task_id/_cancel?pretty"
> curl -XPOST "localhost:9200/_tasks/_Gy3HAVDT0ytQrN9D_Eh3A:43754248/_cancel?pretty"
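To decide which tasks are worth cancelling, the task list can be filtered by running time. The sketch below scans the pretty-printed `_tasks` output line by line with `awk`, which is fragile compared to a real JSON parser and assumes one field per line as produced by `?pretty`. The helper name `slow_tasks` and the `ES_URL` variable are hypothetical.

```shell
# Base URL of the cluster (assumption: same address as used on this page).
ES_URL="${ES_URL:-http://localhost:9200}"

# Print "task_id seconds" for every task that has been running for at least
# $1 seconds. Reads pretty-printed `_tasks` output on stdin.
slow_tasks() {
  awk -v min="$1" '
    # A task entry starts with a key of the form "node_id:number" : {
    /"[^"]+:[0-9]+" *: *[{]/ { split($0, part, "\""); id = part[2] }
    # Keep only the digits of the running time and convert nanoseconds to seconds.
    /"running_time_in_nanos"/ {
      nanos = $0
      gsub(/[^0-9]/, "", nanos)
      if (nanos / 1e9 >= min + 0) printf "%s %.1f\n", id, nanos / 1e9
    }'
}

# Usage (requires a reachable cluster):
#   curl -s -XGET "$ES_URL/_tasks?pretty" | slow_tasks 5
```

Each task ID printed this way can then be passed to the `_cancel` endpoint shown above, provided the task is reported as cancellable.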