Return relationships traversed, not paths from shortest path

Hi all,
I'm just starting using neo4j, and I'm having a few problems finding all the details of what information is able to be extracted from paths in neo4j.

In particular, for the shortestPath and kShortestPaths algorithms, I seem to only be able to extract the nodes from the path, not the relationship elements.

The query I'm using is:

MATCH (start {REFERENCE:'NB7864'}), (end {REFERENCE:'NB4642'})
CALL, end, 'weight')
YIELD nodeId, cost
RETURN algo.asNode(nodeId), cost

MATCH (start {REFERENCE:'NB7864'}), (end {REFERENCE:'NB4642'})
CALL, end, 5, 'weight', {path:true})
YIELD path, costs
RETURN path, costs

But I'm not sure how to extract the edges from the returned path or values.

Thanks for any help.

Michael Webster

1 Like

Hi, @mike.webster maybe you could provide sample what output are you aiming for? Do you want to get relationships between nodes in this path?

apoc.algo.cover(nodes) yield rel - returns all relationships between this set of nodes

MATCH (start {title: 'The Birdcage'}), (end {title: 'The Matrix'}) 
CALL, end)
YIELD nodeId
WITH collect(nodeId) as nodeList
CALL apoc.algo.cover(nodeList) 

Yes, that's just what I want, thanks. I'm just trying to adapt data in a GIS db about fibre optics, and I need to map back to the fibre cables that represent the links between nodes, rather than just the nodes themselves.

Thanks very much.

Actually - it's not exactly what I want, as it seems to return all edges from the nodes in the path, not the edges used in the path.

What I'd like ideally is that I would have returned:

node1 -> edgeTaken -> node2 -> edgeTaken -> ... lastNode

I notice the graphical view shows that, but I'm finding it hard to get back a textual representation of it.

So you want node-relationship-node-... ?
apoc.path.elements(path) returns a list of node-relationship-node-…​

MATCH (start {title: 'The Birdcage'}), (end {title: 'The Matrix'}), path = shortestpath((start)-[*]-(end))
RETURN apoc.path.elements(path)

β”‚"apoc.path.elements(path)"                                            β”‚
β”‚[{"title":"The Birdcage","tagline":"Come as you are","released":1996},β”‚
β”‚{"summary":"Slapstick redeemed only by the Robin Williams and Gene Hacβ”‚
β”‚kman's stellar performances","rating":45},{"name":"Jessica Thompson"},β”‚
β”‚{"summary":"An amazing journey","rating":95},{"title":"Cloud Atlas","tβ”‚
β”‚agline":"Everything is connected","released":2012},{},{"name":"Lana Waβ”‚
β”‚chowski","born":1965},{},{"title":"The Matrix","tagline":"Welcome to tβ”‚
β”‚he Real World","released":1999}]                                      β”‚
MATCH (start {REFERENCE:'NB7864'}), (end {REFERENCE:'NB4642'})
CALL, end, 5, 'weight', {path:true})
YIELD index, nodeIds, costs
MATCH path = (a)-[r]-(b)
where id(a) in nodeIds and id(b) in nodeIds and r.weight in costs
with distinct(r) as rls
return rls

I hope this helps.

Thanks very much for that. I don't think I can use the builtin shortest path for a weighted graph, and I can't get the path = shortestPath(...) syntax to work with the algo.shortestPath method.

I think jaini's solution below helps.

Note also that your Cypher doesn't have any labels, so you're performing 2 AllNodesScans of all nodes in db to find your start and end nodes, which will get more and more expensive as your graph grows:

So use labels, and ensure you have indexes such that the EXPLAIN plan of your query shows index lookups are being used for each node.

Thanks jaini, I can run that query, and it does extract the paths. I suspect you don't need distinct in the query, because that's guaranteed by the algorithm?

In any case, I've tried modifying it to get a format like this:

n1 , rel1, n2
n2, rel2, n3
n(k-1), rel(k-1),nk

with this query:

CALL, end, 0, 'weight', {path:true})
YIELD index, nodeIds, costs
MATCH path = (a)-[r]->(b)
where id(a) in nodeIds and id(b) in nodeIds and r.weight in costs
//with distinct(r) as rls, a, b
return a, r, b

But the ordering seems out.

BTW: Does the MATCH path = (a)-[r]-(b) part of the query match back to the path returned by the shortestPaths algorithm, or is it matching against the entire graph? Because that could be problematic.


Actually, I put an index on REFERENCE that I hoped would do the trick.

I'm now thinking I need to look into the cypher projection, and nodeQuery and relationshipQuery to solve this issue.

Just another question relating to using labels - I have several elements in the domain all of which can be in a path - namely buildings, enclosures, hubsites - and probably few others later.

Is it worthwhile creating a separate label that encompasses all those other labels? That would mean I could use a label in the query above, something like 'FibreTerminationPoint'.

The ordering isn't right here, what you're doing in this query is doing the shortest path streaming, and then for each row result, performing a MATCH of a path for every occurrence of the pattern across all nodes of your graph, that will have terrible performance and won't give you what you want as a result.

Using labels is a good start for your start and end nodes, but since your intent is to lookup specific nodes by a property, you need to create indexes so your node lookups are quick.

As for your shortestPath calls, it looks like can YIELD a path variable, maybe that will work for you.

Do you have weight properties that need to be evaluated, or are your shortest paths only with respect to the length of the paths and don't need to take any kind of property weight into account? If the latter, there are other approaches we can use, some of which are good at filtering by label.

Yes - I don't think that solution is correct. I suspect I'm rematching against the entire graph, which means I could pick up extraneous relations.

I've tried returning path, but all it seems to contain is the cost property which I'm passing in as weight.

I do need to use some kind of 'weight' value, so the graph algorithm versions of shortestPath seem appropriate.

My current theory is that I need to use cypher projection and nodeQuery and relationshipQuery to help achieve what I'm after. Later on I will also want to put some further constraints on the paths that are followed, such as looking at parameters of nodes that indicate a measure of congestion.

I had hoped I might be able to pass a function into shortestPath to calculate a weight at runtime, but I haven't seen how to do this - maybe the nodeQuery and relationshipQuery parameters can help with this.

Possibly I can write a server extension if I can't find anything else to do what I want.

Currently I'm indexing every node based on it's REFERENCE field, eg. NB7864.

Thanks for the suggestions - I'm trying to very rapidly come up to speed.

Just keep in mind for indexes, your pattern MUST contain the label for which the index is under, as well as a predicate on the indexed property. You can use an EXPLAIN of the plan to double-check that an index lookup is being used, and not a more expensive NodeByLabelScan or AllNodesScan.