
We are running a Corda 3.1 network with a notary, 3 party nodes, and a network map service. Each node has a persistent PostgreSQL database. When we restart a node, we get the following stack trace:

[ERROR] 2018-05-31T13:53:37,386Z [main] internal.Node.run - Exception during node startup {}
java.lang.IllegalArgumentException: More than one node found with legal name O=*****, L=*****, C=**
        at net.corda.node.services.network.PersistentNetworkMapCache.getNodeByLegalName(PersistentNetworkMapCache.kt:161) ~[corda-node-3.1-corda.jar:?]
        at net.corda.node.services.network.NetworkMapCacheImpl.getNodeByLegalName(PersistentNetworkMapCache.kt) ~[corda-node-3.1-corda.jar:?]
        at net.corda.node.internal.AbstractNode.updateNodeInfo(AbstractNode.kt:324) ~[corda-node-3.1-corda.jar:?]
        at net.corda.node.internal.AbstractNode.access$updateNodeInfo(AbstractNode.kt:107) ~[corda-node-3.1-corda.jar:?]
        at net.corda.node.internal.AbstractNode$start$4.invoke(AbstractNode.kt:210) ~[corda-node-3.1-corda.jar:?]
        at net.corda.node.internal.AbstractNode$start$4.invoke(AbstractNode.kt:107) ~[corda-node-3.1-corda.jar:?]
        at net.corda.node.internal.AbstractNode$initialiseDatabasePersistence$2.invoke(AbstractNode.kt:673) ~[corda-node-3.1-corda.jar:?]
        at net.corda.node.internal.AbstractNode$initialiseDatabasePersistence$2.invoke(AbstractNode.kt:107) ~[corda-node-3.1-corda.jar:?]
        at net.corda.nodeapi.internal.persistence.CordaPersistence.inTopLevelTransaction(CordaPersistence.kt:148) ~[corda-node-api-3.1-corda.jar:?]
        at net.corda.nodeapi.internal.persistence.CordaPersistence.transaction(CordaPersistence.kt:134) ~[corda-node-api-3.1-corda.jar:?]
        at net.corda.nodeapi.internal.persistence.CordaPersistence.transaction(CordaPersistence.kt:120) ~[corda-node-api-3.1-corda.jar:?]
        at net.corda.nodeapi.internal.persistence.CordaPersistence.transaction(CordaPersistence.kt:127) ~[corda-node-api-3.1-corda.jar:?]
        at net.corda.node.internal.AbstractNode.initialiseDatabasePersistence(AbstractNode.kt:672) ~[corda-node-3.1-corda.jar:?]
        at net.corda.node.internal.Node.initialiseDatabasePersistence(Node.kt:337) ~[corda-node-3.1-corda.jar:?]
        at net.corda.node.internal.AbstractNode.start(AbstractNode.kt:208) ~[corda-node-3.1-corda.jar:?]
        at net.corda.node.internal.Node.start(Node.kt:351) ~[corda-node-3.1-corda.jar:?]
        at net.corda.node.internal.NodeStartup.startNode(NodeStartup.kt:140) ~[corda-node-3.1-corda.jar:?]
        at net.corda.node.internal.NodeStartup.run(NodeStartup.kt:114) [corda-node-3.1-corda.jar:?]
        at net.corda.node.Corda.main(Corda.kt:13) [corda-node-3.1-corda.jar:?]

Does this indicate a problem with how our network map service registers nodes, or something else?

In Corda 3, the network map node has been removed. Are you still including it in your network? - Joel
Thanks for your response @Joel! We are running a network map service (Spring Boot) based upon Stefano's sample: github.com/roastario/spring-boot-network-map. We aren't using the network bootstrapping tool but are still running our nodes in 'dev mode', deployed via CFT on AWS. We are trying to take it one step at a time to better understand a production network. Perhaps we need to take it a bit further, remove 'dev mode', and build our own certificates? - Bret
Hey, I'm just back from holiday - I will take a look at this! - Stefano

2 Answers


So, after some investigation, I believe it comes down to the following sequence:

  1. Node X starts up in devMode, generates a keypair and publishes its nodeInfo to the NetworkMap

  2. Corda downloads the existing network and internally de-duplicates nodeInfos by using the node's public key as the dedupe key

  3. Node X is shut down

  4. Node X has its local storage reset

  5. Node X starts up, generates a new keypair and publishes its nodeInfo

  6. The keypair has changed, which means the deduplication does not work

  7. Corda throws an exception, because it looks like there are two nodes trying to impersonate CN=xxxx, O=xx ...
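
To make the sequence above concrete, here is a minimal toy model of it in Python. It is not Corda's code: the `NodeInfo` and `ToyNetworkMapCache` classes, the key strings, and the legal name are all hypothetical stand-ins. It only assumes what the steps describe, namely that entries are de-duplicated by public key and that a lookup by legal name fails when it finds more than one match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeInfo:
    legal_name: str  # X500 name (hypothetical example value below)
    public_key: str  # stand-in for the node's identity public key


class ToyNetworkMapCache:
    """Toy cache that de-duplicates entries by public key (step 2)."""

    def __init__(self):
        self._by_key = {}

    def add_node(self, info):
        # Same key published again: the old entry is simply overwritten.
        self._by_key[info.public_key] = info

    def get_node_by_legal_name(self, name):
        matches = [n for n in self._by_key.values() if n.legal_name == name]
        if len(matches) > 1:
            raise ValueError(f"More than one node found with legal name {name}")
        return matches[0] if matches else None


cache = ToyNetworkMapCache()
name = "O=PartyA, L=London, C=GB"

cache.add_node(NodeInfo(name, "key-1"))  # step 1: first start
cache.add_node(NodeInfo(name, "key-1"))  # restart with the keypair intact
assert cache.get_node_by_legal_name(name).public_key == "key-1"

cache.add_node(NodeInfo(name, "key-2"))  # steps 4-5: storage reset, new keypair
try:
    cache.get_node_by_legal_name(name)   # steps 6-7: two entries share one name
except ValueError as e:
    print(e)
```

In this model a restart that keeps the same keypair is harmless, while a restart with a fresh keypair leaves two entries under one legal name, which is the situation the stack trace in the question reports.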

So we will have a discussion here: is it a realistic scenario for a node to change keypairs, and if so, should we throw an exception on finding two nodes with the same X500 name but different public keys?

For now, I would recommend at least keeping the certificates folder persistent. This will prevent the node from regenerating the keypair on restart/rebuild.

I'll also add an endpoint to the network map, which will allow you to clear the DB for a given X500.


Just for investigation, could you stop the node, then run

DELETE FROM NODE_LINK_NODEINFO_PARTY;
DELETE FROM NODE_INFO_HOSTS;
DELETE FROM NODE_INFOS;
DELETE FROM NODE_INFO_PARTY_CERT;

on the DB the node is connected to, and then start the node.

If that works, it points to a race condition on node startup.