Recently I deployed a number of vRealize Automation blueprints that made use of VMware NSX on-demand networking. To ensure virtual machines could route traffic back to the
production network, I deployed a Distributed Logical Router (DLR) and Edge Services Gateway (ESG) and configured the routing accordingly.
Initially this all went well. VMs deployed to the management cluster could route out all the way to the physical network and beyond. However, workloads deployed to the compute cluster couldn’t. They could communicate with each other, just not with the DLR or ESG.
It was then that I noticed that the VTEPs were deployed on the correct dvSwitch on the compute cluster, but not the management cluster.
The solution was to move all NSX portgroups (logical switches and VTEP uplinks) from one dvSwitch to another. However, you first have to remove anything and everything that makes use of them.
Fortunately I had no production workloads using virtual networking, except for the DLR and ESG. Configuring the North/South routing up to my Cisco ASA HA pair using OSPF took a while to get right, so while redeploying Edge Services is a trivial matter, recreating the configuration can take time.
VMware advises against backing up the Edge Services VMs directly. In the event of a failure you would normally opt to redeploy straight from the GUI. But how do you completely remove them whilst keeping (and reusing) the config?
Wait Mr Postman
This is where the REST API comes to the rescue. If you’re not familiar with using REST with NSX then I urge you to take a look at the brilliant documentation VMware have produced. This can be found at https://pubs.vmware.com/NSX-62/topic/com.vmware.ICbase/PDF/nsx_62_api.pdf.
There are a number of REST clients out there, but my favourite is Postman. There’s a version for Mac and a Chrome Plugin, plus a version for Windows if you know anyone who still uses that…
Open Postman and select the GET request method and configure the following request URL:
https://NSX Manager IP/api/4.0/edges/
Change the Authorization type to Basic and enter your NSX Manager admin credentials, then click Update Request.
Click the Headers tab and add a new key called Content-Type, and set the value to application/xml.
Click Send. If you fail to get a response, the most likely reason is SSL certificate verification. Disable this in Postman's Settings to proceed.
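If you would rather script the request than click through Postman, the same GET can be sketched with Python's standard library. This is only a rough equivalent of the steps above: the host name and credentials are hypothetical placeholders, and switching off certificate verification mirrors the Postman setting, so treat it as a lab-only shortcut.

```python
import base64
import ssl
import urllib.request

# Hypothetical placeholders: replace with your own NSX Manager address and credentials.
NSX_MANAGER = "nsx-manager.example.local"
USERNAME = "admin"
PASSWORD = "changeme"


def build_request(path, method="GET", body=None):
    """Build an NSX API request with Basic auth and an XML Content-Type header."""
    token = base64.b64encode(f"{USERNAME}:{PASSWORD}".encode()).decode()
    return urllib.request.Request(
        url=f"https://{NSX_MANAGER}{path}",
        data=body,
        method=method,
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/xml",
        },
    )


def unverified_context():
    """Mirror Postman's 'disable SSL certificate verification' setting (lab use only)."""
    context = ssl.create_default_context()
    context.check_hostname = False  # must be disabled before verify_mode is relaxed
    context.verify_mode = ssl.CERT_NONE
    return context


def list_edges():
    """Send GET /api/4.0/edges/ and return the raw XML response as text."""
    request = build_request("/api/4.0/edges/")
    with urllib.request.urlopen(request, context=unverified_context()) as response:
        return response.read().decode("utf-8")
```

Calling `list_edges()` against a reachable NSX Manager should hand back the same XML that Postman displays.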
If everything works well you should receive a raft of XML data in return.
This will return all Edge Services deployed in NSX. Locate the ones you aim to redeploy. In my example I need to ensure I have backed up both the DLR and ESG. In the XML, these are identified by the edgeType tag.
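Picking the right entries out of a long XML response by eye gets tedious, so here is a small sketch that lists every edge with its type. The sample response is hypothetical and heavily trimmed; apart from edgeType, the element names (pagedEdgeList, edgeSummary, objectId, name) and the example values are my assumptions about the layout, so check them against your own output.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed response; a real one carries many more fields.
SAMPLE_RESPONSE = """
<pagedEdgeList>
  <edgePage>
    <edgeSummary>
      <objectId>edge-1</objectId>
      <name>DLR-01</name>
      <edgeType>distributedRouter</edgeType>
    </edgeSummary>
    <edgeSummary>
      <objectId>edge-2</objectId>
      <name>ESG-01</name>
      <edgeType>gatewayServices</edgeType>
    </edgeSummary>
  </edgePage>
</pagedEdgeList>
"""


def summarise_edges(xml_text):
    """Return (id, name, edgeType) for every element that has an edgeType child."""
    root = ET.fromstring(xml_text.strip())
    edges = []
    for element in root.iter():
        edge_type = element.findtext("edgeType")
        if edge_type is None:
            continue  # not an edge entry
        # Assumed ID element names; fall back from objectId to id.
        edge_id = element.findtext("objectId") or element.findtext("id")
        edges.append((edge_id, element.findtext("name"), edge_type))
    return edges


for edge_id, name, edge_type in summarise_edges(SAMPLE_RESPONSE):
    print(edge_id, name, edge_type)
```

The IDs it prints are what you append to the request URL in the next step.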
Use the GET method again, but append the edge ID to the request URL to get the XML data for each appliance you wish to redeploy, e.g. https://NSX Manager IP/api/4.0/edges/edgeId
Save the XML data as we’ll need this later on.
Blow it away and start again
At this point I need to remove my DLR and ESG to progress with modifying my NSX settings. I could use the GUI to do this, but as we have the REST API open why not use that?
Remove each device by changing the method to DELETE and using the following URL:
https://NSX Manager IP/api/4.0/edges/edgeId
If successful you should get a 204 No Content result.
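For completeness, the delete can be scripted too. The sketch below is an assumption-laden equivalent of the Postman step: `base_url`, `headers` and the `opener` parameter are names I have made up for illustration, with `headers` expected to carry the same Basic Authorization header used earlier.

```python
import urllib.error
import urllib.request


def delete_edge(base_url, edge_id, headers, opener=urllib.request.urlopen):
    """Send DELETE <base_url>/api/4.0/edges/<edge_id>; True only on 204 No Content."""
    request = urllib.request.Request(
        url=f"{base_url}/api/4.0/edges/{edge_id}",
        method="DELETE",
        headers=headers,
    )
    try:
        with opener(request) as response:
            return response.status == 204
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are raised as exceptions by urllib.
        print(f"Delete of {edge_id} failed with HTTP {error.code}")
        return False
```

The `opener` argument is only there so you can swap in something else, for example a lambda that calls `urllib.request.urlopen` with an SSL context that skips certificate verification.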
Time to redeploy
Paste the XML data you saved previously for each appliance into a text editor. Locate the cliSettings tag and insert a password element inside it, e.g. <password>YourPassword</password>
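If you have several appliances to redeploy, editing each file by hand is error-prone. Here is a minimal sketch of the same edit in Python. The sample XML is hypothetical and trimmed, and placing the password directly after userName is my assumption about a sensible position rather than something I have confirmed the API requires.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed edge XML as saved from the earlier GET request.
SAMPLE_EDGE = (
    "<edge><id>edge-2</id><cliSettings>"
    "<userName>admin</userName><remoteAccess>false</remoteAccess>"
    "</cliSettings></edge>"
)


def set_cli_password(edge_xml, password):
    """Return the edge XML with a <password> element set inside <cliSettings>."""
    root = ET.fromstring(edge_xml.strip())
    cli_settings = root.find(".//cliSettings")
    if cli_settings is None:
        raise ValueError("no cliSettings element found in the saved XML")
    password_element = cli_settings.find("password")
    if password_element is None:
        password_element = ET.Element("password")
        children = list(cli_settings)
        user_name = cli_settings.find("userName")
        # Assumed position: right after userName, otherwise at the end.
        position = children.index(user_name) + 1 if user_name is not None else len(children)
        cli_settings.insert(position, password_element)
    password_element.text = password
    return ET.tostring(root, encoding="unicode")


print(set_cli_password(SAMPLE_EDGE, "YourPassword"))
```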
In your REST client of choice, change the method to POST and use the following URL:
https://NSX Manager IP/api/4.0/edges
Paste the XML into the body and click Send. If the appliance deployed correctly you should receive a 201 Created status message.
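The redeploy step can be sketched the same way. As before, this is only an illustration of the POST described above: `base_url`, `headers` and `opener` are hypothetical names of my own, and the function reports success purely on the 201 Created status.

```python
import urllib.request


def redeploy_edge(base_url, edge_xml, headers, opener=urllib.request.urlopen):
    """POST the saved edge XML to /api/4.0/edges; True only on 201 Created."""
    request = urllib.request.Request(
        url=f"{base_url}/api/4.0/edges",
        data=edge_xml.encode("utf-8"),  # the XML goes in the request body
        method="POST",
        headers={**headers, "Content-Type": "application/xml"},
    )
    # urllib raises HTTPError for 4xx/5xx, so a rejected payload surfaces as an exception.
    with opener(request) as response:
        return response.status == 201
```

Pass in the XML that already has the password element added, otherwise expect the request to be rejected.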
After removing my Edge Services appliances I removed my management cluster from the Transport Zone. I then reconfigured the cluster so that all NSX VTEPs were deployed to the correct dvSwitch, and then re-added the cluster to the Transport Zone.
Finally I redeployed the Edge Services as shown above. After that everything worked as expected.