Elasticsearch becomes unresponsive with a 503 error

Article ID: 100061804

Description

Error Message

API calls may show the following: 503 master_not_discovered_exception

Dtrace may show the following: 
4192532   23:59:33.112    [10500]   (EVIndexQueryServer)   <31452>   EV-H   {ESSearch.ExecuteQuery} Elasticsearch failed to complete search request with error: Elasticsearch.Net.ElasticsearchClientException: The remote server returned an error: (503) Server Unavailable. Call: Status code 503 from: POST /15d50e9d3feff454cb2a6e0f775f6dae5_48403%2A/_search?typed_keys=true&ignore_unavailable=true. ServerError: Type: search_phase_execution_exception Reason: "all shards failed" ---> System.Net.WebException: The remote server returned an error: (503) Server Unavailable.|  at System.Net.HttpWebRequest.GetResponse()|  at Elasticsearch.Net.HttpWebRequestConnection.Request[TResponse](RequestData requestData)|  --- End of inner exception stack trace ---|  at Symantec.EnterpriseVault.Indexing.IndexingEngine.ESClient.GetData[T](ESInputQuery esQuery)|  at Symantec.EnterpriseVault.Indexing.Search.ESSearch.ExecuteQuery(String index, BoolQuery boolQuery, List`1 sortData, String[] storedFields, Int32 startResultsFrom, Int32 resultsWindowSize, Boolean logQuery, Int32 timeoutInSec, String computerEntryId, Int64& totalHits, TruncationReason& truncationReason)

The ElasticsearchLogger may show the following:
(EVIndexingEngineElasticsearch)        EV-H    {ElasticsearchLogger} Caused by: java.io.IOException: An unexpected network error occurred
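
The 503 response seen on API calls can also be recognized programmatically, for example in a monitoring script. The following is a minimal sketch only: it assumes the response body uses the usual Elasticsearch JSON error layout (an "error" object with a "type" field), and the function name is_master_not_discovered and the sample body are hypothetical, not taken from Enterprise Vault.

```python
import json


def is_master_not_discovered(status_code: int, body: str) -> bool:
    """Return True if an HTTP response looks like a 503 master_not_discovered_exception.

    Assumption: the body is JSON shaped like {"error": {"type": ...}, "status": 503}.
    """
    if status_code != 503:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        # Body was not valid JSON (for example an HTML error page from a proxy).
        return False
    error = payload.get("error") if isinstance(payload, dict) else None
    if isinstance(error, dict):
        return error.get("type") == "master_not_discovered_exception"
    # Some responses may carry the error as a plain string instead of an object.
    return isinstance(error, str) and "master_not_discovered_exception" in error


# Hypothetical sample body, for illustration only.
sample_body = json.dumps(
    {"error": {"type": "master_not_discovered_exception", "reason": None}, "status": 503}
)
print(is_master_not_discovered(503, sample_body))
```

A check like this only identifies the symptom; the cause still has to be found in the Dtrace and ElasticsearchLogger output.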

Cause

This issue can occur when the Elasticsearch indexes are stored on a network or storage device that is performing poorly. Elasticsearch requires the file system that holds its indexes to behave as if it were a local disk. Other factors can contribute in combination with an underperforming network or disk, such as a JVM heap that is not sized appropriately or corrupt indices.

Resolution

  • Check for poor or intermittent network performance and resolve any issues found.
  • Ensure that 1 GB of JVM heap is assigned for every 20 shards on the Elasticsearch node, up to a maximum of 30 GB or 50% of physical memory, whichever is lower.
  • Ensure that antivirus exclusions are in place.
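
The heap sizing guideline above can be worked through as simple arithmetic. The sketch below is illustrative only: the function name recommended_heap_gb and the example shard counts and memory sizes are hypothetical, and rounding the required heap up to a whole gigabyte is an assumption rather than part of the guideline.

```python
import math

SHARDS_PER_GB = 20   # guideline: 1 GB of JVM heap per 20 shards
MAX_HEAP_GB = 30     # guideline: never assign more than 30 GB


def recommended_heap_gb(shard_count: int, physical_memory_gb: float) -> dict:
    """Apply the guideline: 1 GB per 20 shards, capped at min(30 GB, 50% of RAM)."""
    # Assumption: round up so a partial group of 20 shards still gets a full GB.
    needed = math.ceil(shard_count / SHARDS_PER_GB)
    ceiling = min(MAX_HEAP_GB, physical_memory_gb * 0.5)
    return {
        "needed_gb": needed,               # heap implied by the shard count
        "ceiling_gb": ceiling,             # most heap the node may be given
        "assign_gb": min(needed, ceiling), # value to configure
        "over_ceiling": needed > ceiling,  # True if shards exceed what the heap supports
    }


# Hypothetical nodes: 400 shards with 64 GB RAM, and 800 shards with 32 GB RAM.
print(recommended_heap_gb(400, 64))
print(recommended_heap_gb(800, 32))
```

When over_ceiling is True, the node holds more shards than the guideline allows for the largest heap it can be assigned.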

Issue/Introduction

Periodically, Elasticsearch becomes unresponsive, and IIS and the indexing service must be restarted to get it working again.

Additional Information

JIRA: CFT-5898