Performance of filtering by cumulative relationship property (distance)

Hi there

(cross-posting from SO)

I'm trying to get all vertices up to a certain maximum cumulative weight (distance) from a specific node. The query

MATCH route = (p1:ReferencePlace) - [roadlist:EROAD*..5] - (p2:ReferencePlace)
WITH p1, p2, route, roadlist, 
	REDUCE(sum=0.0, road in roadlist | sum + toFloat(road.distance)) as totaldistance
WHERE totaldistance < 300
AND p1.name = "Paris"
RETURN p1, p2, totaldistance

produces the right output. This uses the E-road data example imported like here. It returns all places which are less than 300 km from Paris.

The problem is, that this only works for a limited number of hops EROAD*..5]. For EROAD*] it "hangs" (I don't know whether it would finish in any reasonable amount of time). This is because Neo4j first finds all possible routes and then filters them. Therefore, it makes sense that the number of routes gets infeasibly large even for a small graph like this.

In theory it would be no big deal to implement a BFS algorithm from scratch which just gathers all relevant vertices as long as long as the cumulative distance is smaller than the threshold and only visits the relevant ones. But I'm wondering whether there's a Neo4j way of doing this.

Hi @mcsoini,

Welcome to the community,

I tried to replicate your issue and I faced the same problem.
However I have some other dataset also (very big than yours) and when I tried to run similar query there it gave me result.
I am not very sure but I suspect there is some cycle because of that result is not coming(out of memory).
Unfortunately I did not find any suitable query to find the cycle

1 Like

Thanks for the reply @intouch.vivek. Yes, there are definitely cycles in that graph. Based on your comment I experimented with an intermediate MST step to simplify the graph:

// 02 GET MST FROM PARIS
MATCH (n:ReferencePlace {name:"Paris"})
CALL algo.spanningTree.minimum('ReferencePlace', 'EROAD', 'distance', id(n),
  {write:true, writeProperty:"MINST"})
YIELD loadMillis, computeMillis, writeMillis, effectiveNodeCount
RETURN loadMillis, computeMillis, writeMillis, effectiveNodeCount;

// 03 UPDATE MINST RELATIONSHIPS WITH EROAD DISTANCES
MATCH (p1) - [m:MINST] - (p2), (p1) - [r:EROAD] - (p2)
SET m.distance = r.distance;

This allows to run my originally suggested query without limitations to the number of hops. However, calculating the MST drops some solutions to my original problems.

I finally discovered algo.shortestPath.deltaStepping.stream, which I use as follows:

PROFILE MATCH (start:ReferencePlace {name:'Paris'})
CALL algo.shortestPath.deltaStepping.stream(start, 'distance', 3.0, {relationshipQuery: 'EROAD'})
YIELD nodeId, distance
WHERE distance <= 800
RETURN algo.asNode(nodeId) AS destination, distance

This seems to be doing what I want, even though conceptually it's not a straight-forward approach to my problem. Will see how it scales for larger graphs.