2
votes

I'm struggling to set up Hibernate Search with the Elasticsearch backend in a Spring Boot application.

What I have is Spring Boot and the following dependencies:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
    <version>1.4.0.M3</version>
</dependency>

<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-search-backend-elasticsearch</artifactId>
    <version>5.6.0.Alpha3</version>
</dependency>

The problem is that Hibernate Search initializes before Elasticsearch has finished starting.

The following property exposes the REST interface as well:

spring:
   data:
      elasticsearch:
         properties:
            http:
               enabled: true

Startup then fails with this exception:

Caused by: org.apache.http.conn.HttpHostConnectException: Connect to localhost:9200 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused: connect

Now, how do I define a dependency here?

I tried using a custom BeanFactoryPostProcessor to inject a dependency on Elasticsearch, but that seems to be ignored in the auto-configuration scenario.

Is there any way to wait until Elasticsearch is up?

The setup works when I set the Hibernate index_management_strategy to NONE, but then the index is not configured and all custom analyzer annotations are ignored. Elasticsearch falls back to its default mappings, which cannot be configured in the auto-configuration scenario.

Ideally Elasticsearch would be hosted outside the JVM, but running it embedded is convenient in testing scenarios.

3
It tries to connect to an ES server on localhost, which you're obviously not running, so it's not about Hibernate. - xeye
Not quite: through the magic of Spring Boot auto-configuration a single-node instance is started, but in the background. - Joey

3 Answers

2
votes

As I understand it, this is an issue you're hitting during integration tests.

You could have a look at how we start ES during the integration tests within Hibernate Search itself, using a Maven plugin which makes sure the server is started before the tests run: https://github.com/hibernate/hibernate-search/blob/5.6.0.Beta1/elasticsearch/pom.xml#L341-L368

N.B. this uses a custom ES configuration, tuned to start quickly even though it's only a single-node cluster: https://raw.githubusercontent.com/hibernate/hibernate-search/5.6.0.Beta1/elasticsearch/elasticsearchconfiguration/elasticsearch.yml

Hibernate Search uses the Jest client to connect to ES, so you need to enable the HTTP connector of ES. This is not to be confused with a NodeClient, which is a different operating mode.

If your question isn't about automated testing but about production clusters, then I'd suggest using a service orchestrator such as Kubernetes.
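
Whichever way the server gets started, a test setup can also check that the HTTP port answers before Hibernate Search boots. The following is only a minimal sketch using the plain JDK, not what the Maven plugin does internally; the class name PortWaiter, the method name and the timeout values are made up for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Hibernate Search or Spring Boot.
final class PortWaiter {

    private PortWaiter() {
    }

    /**
     * Polls host:port until a TCP connection succeeds or the timeout elapses.
     * Returns true if the port accepted a connection in time, false otherwise.
     */
    static boolean waitForPort(String host, int port, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            try (Socket socket = new Socket()) {
                // Give each attempt at most 500 ms to connect.
                socket.connect(new InetSocketAddress(host, port), 500);
                return true;
            } catch (IOException e) {
                // Port not open yet: pause briefly, then retry.
                Thread.sleep(200);
            }
        }
        return false;
    }
}
```

A test could call PortWaiter.waitForPort("localhost", 9200, 30000) before building the application context, 9200 being the port from the exception in the question. An open port only shows that the HTTP connector is listening, not that the cluster is healthy.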

2
votes

Thanks to some help from the Spring Boot team, I was able to solve the issue - solution here.

The problem is that there's no dependency between the EntityManagerFactory bean and the Elasticsearch Client bean so there's no guarantee that Elasticsearch will start before Hibernate. As it happens, Hibernate starts first and then fails to connect to Elasticsearch.

This can be fixed by setting up a dependency between the two beans. An easy way to do that is with a subclass of EntityManagerFactoryDependsOnPostProcessor:

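@Configuration
static class ElasticsearchJpaDependencyConfiguration extends EntityManagerFactoryDependsOnPostProcessor {

    public ElasticsearchJpaDependencyConfiguration() {
        // Make every EntityManagerFactory bean depend on the bean named
        // "elasticsearchClient", so the Elasticsearch Client is created first.
        super("elasticsearchClient");
    }

}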

Now all that is needed is to set the number of replicas to 0, which fixes the health status of the cluster in a single-node deployment. This can be done by specifying an additional property in the application.properties file:

spring.data.elasticsearch.properties.index.number_of_replicas=0
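
To see what the replica setting changes, the cluster health can be read over the HTTP interface. On a single node, replica shards have nowhere to be allocated, so the cluster typically reports yellow until the replica count is 0. Below is a rough sketch, assuming the HTTP connector is enabled as in the question; the class name ClusterHealth is made up, and the regular expression is a shortcut that stands in for a real JSON parser.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
final class ClusterHealth {

    // Matches the "status" field of the health response, e.g. "status":"yellow".
    private static final Pattern STATUS =
            Pattern.compile("\"status\"\\s*:\\s*\"(\\w+)\"");

    private ClusterHealth() {
    }

    /** Returns the status value found in the JSON body, or null if there is none. */
    static String extractStatus(String json) {
        Matcher matcher = STATUS.matcher(json);
        return matcher.find() ? matcher.group(1) : null;
    }

    /** Requests baseUrl + "/_cluster/health" and returns the reported status. */
    static String fetchStatus(String baseUrl) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(baseUrl + "/_cluster/health").openConnection();
        conn.setConnectTimeout(2000);
        conn.setReadTimeout(2000);
        StringBuilder body = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line);
            }
        } finally {
            conn.disconnect();
        }
        return extractStatus(body.toString());
    }
}
```

Calling ClusterHealth.fetchStatus("http://localhost:9200") against the embedded node should return green, yellow or red, which makes it easy to compare the status before and after setting the property.
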
1
votes

I checked the spring-data docs and it looks like you misunderstood this piece (and it really is confusing; do the authors not understand the tech underneath?):

By default the instance will attempt to connect to a local in-memory server (a NodeClient in Elasticsearch terms), but you can switch to a remote server (i.e. a TransportClient) by setting spring.data.elasticsearch.cluster-nodes to a comma-separated ‘host:port’ list.

A NodeClient is not a "local server"; it's a special type of ES client. This local client can connect to ES cluster nodes that contain data and, as I said in the comment, you don't have any ES data nodes running. Read this for a better understanding: https://www.elastic.co/guide/en/elasticsearch/guide/current/_transport_client_versus_node_client.html