Merge all nodes with the same property name

Hi, I've a problem that I do not know how to code in cypher. I have duplicate nodes with the same property name, (n.name) and they have their own relationships. I wanted to match these nodes, merges the properties and relationships of the 2nd through last nodes onto the first node, and deletes the 2nd through last nodes. I have the code below and it works.

MATCH (n:name)
WHERE n.name = "john"
WITH COLLECT(n) AS ns
CALL apoc.refactor.mergeNodes(ns) YIELD node
RETURN node;

However, I do not want to input the name manually. Is it possible to do a for loop all the nodes that have the same property name and do the above code? like john, jack, jane ...

to find nodes with the same property value

MATCH (n1:name),(n2:name)
WHERE n1.name = n2.name and id(n1) < id(n2)
WITH [n1,n2] as ns
CALL apoc.refactor.mergeNodes(ns) YIELD node
RETURN node

id(n1) < id(n2)
don't quite get this and got syntax error
does the id refer to the internal id that neo4j created? i do not have id as property in the node

Yes, this is internal id of Neo4j. I fixed the syntax error

works! thanks alot!!!!!

Hello, I have actually been looking for a solution like that as well. For me it however leads to a Cartesian product? Is there a way to avoid that?

Yes, it will look similar to the query in the first post, except we'll collect with respect to name (for each row/name, we'll get the collection of nodes with that name) and filter to only rows where there are multiple nodes for that name:

MATCH (n:name)
WITH n.name as name, COLLECT(n) AS ns
WHERE size(ns) > 1
CALL apoc.refactor.mergeNodes(ns) YIELD node
RETURN node;
1 Like

Brilliant, thanks a lot!!!

Hi, I've an error if the property of my nodes are uniquely indexed. Are there any work around with it?

New data does not satisfy CONSTRAINT ON ( label[0]:label[0] ) ASSERT label[0].property[0] IS UNIQUE.

Nvm solved it using this

CALL apoc.refactor.mergeNodes(nodes,{properties:"combine", mergeRels:true}) yield node

Hi andrew, is it possible to create case insensitive merge query like this one?

You mean case insensitive on the name property (or the common property used to identify duplicates), or case insensitive on the properties being merged?

If you want case insensitive on the name property, then it should be enough to use toLower() or toUpper() on that property at the time of the collection:

MATCH (n:name)
WITH toLower(n.name) as name, COLLECT(n) AS ns
WHERE size(ns) > 1
CALL apoc.refactor.mergeNodes(ns) YIELD node
RETURN node;
1 Like

Worked like a charm, thanks.

Hi, I've got a very similar question, but I am unsure on how to solve it.
I have a pretty big database (>3 000 000 nodes) and I'm trying to merge nodes but those who have multiple similar properties only.

First option works ok, but it's doing a cartesian product, and I fear of running out of memory, or that it will take ages to complete.

I want to use second option, but I don't quite understand how "WITH ... as ... COLLECT" works.
Is this query correct ?

MATCH (n:Word) WITH toLower(n.spelling) as spelling AND toLower(n.pos) as pos AND toLower(n.language) as spelling, COLLECT(n) AS ns
WHERE size(ns) > 1
CALL apoc.refactor.mergeNodes(ns) YIELD node
RETURN node;

I hope this is an appropriate place to ask my question. Thank you for ur help
Please excuse my fragile english