There’s some sort of race condition or thread-safety issue in neo4j-spatial and/or Spring Data for Neo4j that causes the data kept by the Spatial Index Provider to become corrupted.
Specifically, a spatial node is only supposed to have one incoming relationship of type RTREE_REFERENCE. However, under some circumstances more than one relationship of this type gets created, and any attempt to set values on properties indexed with the Spatial Index Provider on these nodes will fail with the following exception:

    Caused by: org.neo4j.graphdb.NotFoundException: More than one relationship[RTREE_REFERENCE, INCOMING] found for NodeImpl#3526
        at org.neo4j.kernel.impl.core.NodeImpl.getSingleRelationship(NodeImpl.java:258)
        at org.neo4j.kernel.impl.core.NodeProxy.getSingleRelationship(NodeProxy.java:125)
        at org.neo4j.collections.rtree.RTreeIndex.findLeafContainingGeometryNode(RTreeIndex.java:796)
        at org.neo4j.collections.rtree.RTreeIndex.remove(RTreeIndex.java:110)
        at org.neo4j.collections.rtree.RTreeIndex.remove(RTreeIndex.java:99)
        at org.neo4j.gis.spatial.EditableLayerImpl.update(EditableLayerImpl.java:58)
        at org.neo4j.gis.spatial.indexprovider.LayerNodeIndex.add(LayerNodeIndex.java:144)
        at org.neo4j.gis.spatial.indexprovider.LayerNodeIndex.add(LayerNodeIndex.java:52)
        at org.springframework.data.neo4j.fieldaccess.IndexingPropertyFieldAccessorListenerFactory$IndexingPropertyFieldAccessorListener.valueChanged(IndexingPropertyFieldAccessorListenerFactory.java:86)
        at org.springframework.data.neo4j.fieldaccess.DefaultEntityState.notifyListeners(DefaultEntityState.java:137)
        at org.springframework.data.neo4j.fieldaccess.DefaultEntityState.setValue(DefaultEntityState.java:114)
        at org.springframework.data.neo4j.fieldaccess.DetachedEntityState.setValue(DetachedEntityState.java:158)
        at org.springframework.data.neo4j.fieldaccess.DetachedEntityState.setValue(DetachedEntityState.java:137)

I have been unable to reproduce the conditions under which this issue occurs in a test case of any sort, so finding a cause or solution has been difficult.
While the issue is still being debugged, I put in a temporary workaround that at least removes the failures when setting the spatial properties on the node entities.
I implemented a simple monitor that searches for nodes with more than one incoming RTREE_REFERENCE relationship, then deletes all but one of those relationships from the node. I’m using Spring Framework’s task scheduler functionality.
I created a simple Java bean for the monitor:

    package fi.iki.tpp.graph.neo4j.spatial;

    import java.util.Iterator;
    import java.util.List;

    import org.apache.log4j.Logger;
    import org.neo4j.cypher.javacompat.ExecutionEngine;
    import org.neo4j.cypher.javacompat.ExecutionResult;
    import org.neo4j.graphdb.Direction;
    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.NotFoundException;
    import org.neo4j.graphdb.Relationship;
    import org.neo4j.graphdb.RelationshipType;
    import org.neo4j.graphdb.Transaction;
    import org.neo4j.helpers.collection.IteratorUtil;

    import com.google.common.collect.ImmutableList;

    public class SpatialReferenceBugMonitor {
        private static final Logger logger = Logger.getLogger(SpatialReferenceBugMonitor.class);

        // Finds every node with more than one incoming RTREE_REFERENCE relationship.
        private static final String QUERY = "start n = node(*) match (n)<-[r:RTREE_REFERENCE]-() with n, count(r) as rel_count where rel_count > 1 return n";
        private static final String RELATIONSHIP_TYPE = "RTREE_REFERENCE";

        private final GraphDatabaseService service;

        public SpatialReferenceBugMonitor(GraphDatabaseService service) {
            this.service = service;
        }

        public void run() {
            logger.debug("Running");

            ExecutionEngine engine = new ExecutionEngine(service);
            ExecutionResult result = engine.execute(QUERY);

            // All deletions happen in a single transaction.
            Transaction tx = service.beginTx();
            try {
                Iterator<Node> nodes = result.columnAs("n");
                for (Node node : IteratorUtil.asIterable(nodes)) {
                    try {
                        // Copy the relationships into a list so they can be accessed by index.
                        List<Relationship> references = ImmutableList.copyOf(node.getRelationships(Direction.INCOMING, new RelationshipType() {
                            @Override
                            public String name() {
                                return RELATIONSHIP_TYPE;
                            }
                        }));

                        // Keep the first relationship (index 0) and delete the rest.
                        for (int i = 1; i < references.size(); i++) {
                            Relationship reference = references.get(i);
                            logger.debug(String.format("Deleting reference %d between %d and %d", reference.getId(), reference.getStartNode().getId(), reference.getEndNode().getId()));
                            reference.delete();
                        }
                    } catch (NotFoundException e) {
                        // ignore
                    }
                }
                tx.success();
            } finally {
                tx.finish();
            }

            logger.debug("Finished");
        }
    }

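Since I couldn't reproduce the bug itself in a test, it may still be worth checking the "keep one, delete the rest" step in isolation. The following is only a minimal sketch of that selection logic in plain Java, without Neo4j on the classpath. The class name `SurplusReferences`, the method `surplus` and the `Long` ids standing in for `Relationship` objects are all made up for illustration and are not part of the monitor above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: mirrors the loop in the monitor that starts at index 1.
public class SurplusReferences {

    /**
     * Returns every element except the first, i.e. the references the monitor
     * would delete. Returns an empty list for zero or one reference.
     */
    public static <T> List<T> surplus(List<T> references) {
        if (references.size() <= 1) {
            return Collections.emptyList();
        }
        // Copy so the result does not depend on later changes to the input list.
        return new ArrayList<T>(references.subList(1, references.size()));
    }

    public static void main(String[] args) {
        // Made-up relationship ids standing in for Relationship objects.
        List<Long> ids = Arrays.asList(11L, 12L, 13L);
        System.out.println(surplus(ids)); // prints [12, 13]
    }
}
```

Which relationship ends up being kept depends on the order in which the relationships are returned, so this sketch, like the monitor, makes no attempt to pick the "right" one.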
I then created the task scheduler and the scheduled tasks it executes in a Spring Framework configuration file as follows:

    <bean id="spatialBugMonitor" class="fi.iki.tpp.graph.neo4j.spatial.SpatialReferenceBugMonitor">
        <constructor-arg ref="graphDatabaseService"/>
    </bean>

    <task:scheduler id="monitorScheduler" pool-size="1"/>

    <task:scheduled-tasks scheduler="monitorScheduler">
        <task:scheduled ref="spatialBugMonitor" method="run" fixed-rate="1200000"/>
    </task:scheduled-tasks>

graphDatabaseService refers to the Neo4j GraphDatabaseService interface implementation, in my case org.neo4j.kernel.EmbeddedGraphDatabase.
Spring Framework takes care of the execution of the scheduled tasks automatically. I’ve set the monitor to run every 20 minutes, as we’re importing data into the database every 30 minutes or so.
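The fixed-rate value is given in milliseconds: 20 minutes × 60 seconds × 1000 = 1,200,000. As a rough illustration of what a single-threaded fixed-rate schedule looks like, here is a sketch using only the JDK's `ScheduledExecutorService`. It is an analogue of the configuration above, not what Spring does internally; the class name `FixedRateSketch`, the `schedule` helper and the initial delay of 0 are assumptions of the sketch.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the Spring task scheduler configuration.
public class FixedRateSketch {

    // 20 minutes expressed in milliseconds, the unit of the fixed-rate attribute.
    public static final long FIXED_RATE_MS = TimeUnit.MINUTES.toMillis(20);

    /**
     * Runs the task at a fixed rate on a single thread and returns the
     * scheduler so that the caller can shut it down.
     */
    public static ScheduledExecutorService schedule(Runnable task, long periodMs) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        // Initial delay of 0 is an assumption made for this sketch.
        scheduler.scheduleAtFixedRate(task, 0L, periodMs, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(FIXED_RATE_MS); // prints 1200000

        final AtomicInteger runs = new AtomicInteger();
        // A short period is used here only so that the demo finishes quickly.
        ScheduledExecutorService scheduler = schedule(new Runnable() {
            @Override
            public void run() {
                runs.incrementAndGet();
            }
        }, 50L);
        Thread.sleep(300L);
        scheduler.shutdownNow();
        System.out.println("Task ran at least once: " + (runs.get() > 0));
    }
}
```

With the JDK's `scheduleAtFixedRate`, a run that takes longer than the period delays the following runs rather than overlapping with them, which is the behaviour I would want from a cleanup job like this one.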