I am using Spring Data Neo4j 3.3.1.RELEASE with Neo4j server 2.2.3.
My problem is that some of my entities have duplicate nodes that contain only the indexed property.
My class looks something like this:
@NodeEntity
@TypeAlias("Product")
public class Product {
    @GraphId
    private Long graphId;
    @Indexed(indexName="productId", unique=true, indexType=IndexType.SIMPLE)
    private String productId;
    private String productType;
    ...
}
Before saving a node, I first check whether one already exists for that productId. If it does, I update it; otherwise I create a new one.
Product product = productRepository.findByProductId(productId);
if (product == null) {
    product = new Product(productId);
}
...
productRepository.save(product);
The repository interface:
public interface ProductRepository extends GraphRepository<Product> {
    public Product findByProductId(String productId);
}
In Neo4j, each entity is saved as a node with all of its properties. But some nodes also have a duplicate node that contains only the productId. This doesn't happen for every node: we currently have about 120,000 nodes, and as many as 30 of them have this duplicate. Duplicates appear every time we re-ingest the data. Right now we have only 2 duplicate nodes.
One more thing: on inspection, the duplicate nodes appear to have node IDs in sequence, which makes me think they are created together when I save the entity.
EDIT:
Upon investigation, it seems that the unique constraint is not applied to productId. The problem seems to come from the @Indexed annotation. If I use unique and indexName in the same annotation, only the indexName is applied and not the constraint. If I use only one of indexName or unique, SDN creates that one and I must create the other manually via the Neo4j web console, which is kind of annoying. I know that in SDN 4.x.x index maintenance will no longer be part of the code and must be handled externally. Is this something we need to do now, since SDN 3.3.x does not handle it correctly?
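For reference, here is a minimal sketch of the kind of statement that would have to be run by hand in the web console to add the missing constraint. It uses the Neo4j 2.x Cypher constraint syntax. The class and method names (ConstraintCypher, uniqueConstraint) are hypothetical helpers for illustration only. It also assumes the nodes carry a Product label, which depends on SDN being configured with the label-based type representation strategy.

```java
// Hypothetical helper: builds the Neo4j 2.x Cypher statement for a
// uniqueness constraint. Not part of SDN or the Neo4j driver.
class ConstraintCypher {

    // label and property are assumptions taken from the Product entity above:
    // the "Product" label presumes the label-based type representation strategy.
    static String uniqueConstraint(String label, String property) {
        return "CREATE CONSTRAINT ON (n:" + label + ") ASSERT n." + property + " IS UNIQUE";
    }

    public static void main(String[] args) {
        // Print the statement so it can be pasted into the Neo4j web console.
        System.out.println(uniqueConstraint("Product", "productId"));
    }
}
```

Creating the constraint will fail while duplicate productId values still exist, so the existing duplicate nodes would need to be removed first.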