24 January 2018

Cisco ACI APIC and Spine/Leaf Upgrade Process

I'm just getting started with ACI. Here's the general process for upgrading the APICs and the spine/leaf switches.

First check out the Cisco document here. In case the link moves/etc the document is "Cisco APIC Management, Installation, Upgrade and Downgrade Guide". This is essentially an abbreviated version of that document.

Of particular importance in that document is the section "Supported Upgrade Paths for APIC Controller and Switch Software" and the associated downgrade section. Make sure you can jump from where you are to where you want to go. If it ain't listed, it ain't supported... so prepare for headaches if you try it anyway.

The basic process is:

  1. Get files from Cisco onto an HTTP/SCP server and then upload them to the APIC
  2. Get APICs upgraded
  3. Wait for things to stabilise.
  4. Get Leaf/Spines upgraded
  5. Wait for things to stabilise.
This whole process took a few hours to complete... but I was lucky enough to have a fast internet connection for downloading the files. I did the above using the (wimpy) GUI methods, but the linked document lists ways to do the same using REST/CLI/Console/etc.
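If you'd rather go the REST route than the GUI, everything starts with a login. Here's a rough sketch (not something from the Cisco guide verbatim) that builds the login request and pulls the session token out of the reply. The hostname and credentials are made up, and nothing is actually sent here; you'd pass the request to urllib.request.urlopen yourself and then send the token back as the APIC-cookie cookie on later calls.

```python
import json
import urllib.request


def build_login_request(apic_host, username, password):
    """Build (but do not send) an APIC aaaLogin request."""
    url = f"https://{apic_host}/api/aaaLogin.json"
    body = {"aaaUser": {"attributes": {"name": username, "pwd": password}}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_token(login_response):
    """Pull the session token out of an already-decoded aaaLogin JSON reply."""
    return login_response["imdata"][0]["aaaLogin"]["attributes"]["token"]
```

Keeping the "build" and "send" steps separate makes it easy to eyeball the request before pointing it at a production APIC.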

Getting the files from Cisco, to an intermediate HTTP/SCP box and onto the APICs
I couldn't believe it when I downloaded them, but the files are gigantic. There are basically two main bits of software to get: the APIC software and the ACI switch software. Thankfully Cisco puts the matching APIC and ACI versions under the same sub-heading/version, "Application Policy Infrastructure Controller (APIC)", on their download site. Basically if you click on 3.0 it has the APIC version (3.0.1 in my case) and the leaf/spine version (13.0.1) in the same section. In my case (going from 2.2 to 3.0) the files were:
  • aci-apic-dk9.3.0.1k.iso - For APICs 
  • aci-n9000-dk9.13.0.1k.bin - For ACI Leaf/Spine
Grab them from Cisco per the normal process, then upload them to an HTTP/SCP server.
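Since it's easy to grab a mismatched pair, here's a small sanity check on the two filenames. It assumes the naming pattern seen in the two files above, and it assumes the switch version is just the APIC version with a "1" stuck on the front (3.0.1k vs 13.0.1k). That rule is inferred from these filenames, so treat it as a convenience check and not a replacement for the upgrade matrix.

```python
import re

# Pattern inferred from the two filenames in this post; an assumption, not a spec.
IMAGE_RE = re.compile(r"^aci-(apic|n9000)-dk9\.(\d+)\.(\d+)\.(\d+[a-z]*)\.(iso|bin)$")


def parse_image(filename):
    """Return (kind, major, minor, patch) for an ACI image filename."""
    match = IMAGE_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognised image name: {filename}")
    kind, major, minor, patch, _ext = match.groups()
    return kind, major, minor, patch


def images_match(apic_file, switch_file):
    """True if the switch image looks like the partner of the APIC image."""
    a_kind, a_major, a_minor, a_patch = parse_image(apic_file)
    s_kind, s_major, s_minor, s_patch = parse_image(switch_file)
    if a_kind != "apic" or s_kind != "n9000":
        return False
    # Assumed rule: switch major version is "1" followed by the APIC major version.
    return s_major == "1" + a_major and (s_minor, s_patch) == (a_minor, a_patch)
```

For example, images_match("aci-apic-dk9.3.0.1k.iso", "aci-n9000-dk9.13.0.1k.bin") returns True for the pair used here.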

In the APIC, create a "Download Task" (Admin > Firmware > Download Tasks) pointing to each file individually. Once the task is created, the file will be downloaded to the APIC. You can see the status under the "Operational" tab of this page.
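For the REST-inclined, the download task looks to be a firmwareOSource object sitting under firmwareRepoP. The sketch below only builds the JSON body for one file; the object and attribute names (name, proto, url) are from my reading of the REST examples in the Cisco guide and should be checked against your version before use. As far as I can tell it gets POSTed under uni/fabric, but that is an assumption too. The task name and file server URL are made up, and SCP sources also need credentials, which aren't shown.

```python
def build_download_task(task_name, proto, file_url):
    """Build the JSON body for one firmware download task (one per image file).

    Object/attribute names are assumptions based on the Cisco guide's
    REST examples; verify them against your APIC version.
    """
    if proto not in ("http", "scp"):
        raise ValueError(f"unsupported protocol: {proto}")
    return {
        "firmwareRepoP": {
            "attributes": {},
            "children": [
                {
                    "firmwareOSource": {
                        "attributes": {
                            "name": task_name,
                            "proto": proto,
                            "url": file_url,
                        }
                    }
                }
            ],
        }
    }
```

You'd call this once for the APIC ISO and once for the switch image, matching the "each file individually" approach in the GUI.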

It looks like you can upload files directly onto the APIC from the GUI as well now (I didn't try that here though). This looks to be done through "Firmware Repository" under Admin > Firmware > Firmware Repository and clicking the "Upload Firmware to APIC" action.

Upgrading the APICs
In Admin > Firmware > Controller Firmware you'll have an action to "Upgrade Controller". Select the version, schedule, etc. and off you go. The screen will update the Upgrade Progress status bar for each APIC. The system automatically does one APIC at a time, so just sit back and let it do its thing.... which brings me to...


Waiting for things to Stabilise
Just note that during this waiting time APICs will reload. This is non-disruptive as APICs aren't involved in production traffic but are only used to push policy to nodes/etc. This was a good 10-30min process for me. Had to reload the APIC browser session after they rebooted as well.

The APICs all appeared in the "Controller Firmware" screen as being "Upgraded Successfully"
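If you want to check this without the GUI, a class query on firmwareCtrlrRunning should list the running version per controller. Here's a sketch of the checking logic only, assuming you've already fetched and decoded the JSON reply. The "version" attribute name and the "3.0(1k)" version format are what I'd expect the APIC to report, so verify both on your own system.

```python
def controller_versions(response):
    """Map each controller's DN to its running version.

    Expects the decoded JSON from a firmwareCtrlrRunning class query.
    """
    versions = {}
    for item in response.get("imdata", []):
        attrs = item["firmwareCtrlrRunning"]["attributes"]
        versions[attrs["dn"]] = attrs["version"]
    return versions


def all_at_version(response, target):
    """True only if at least one controller was returned and all match target."""
    versions = controller_versions(response)
    return bool(versions) and all(v == target for v in versions.values())


# Hand-written sample in the expected shape; not captured from a real APIC.
sample = {
    "imdata": [
        {"firmwareCtrlrRunning": {"attributes": {
            "dn": "topology/pod-1/node-1/sys/ctrlrfwstatuscont/ctrlrrunning",
            "version": "3.0(1k)"}}},
        {"firmwareCtrlrRunning": {"attributes": {
            "dn": "topology/pod-1/node-2/sys/ctrlrfwstatuscont/ctrlrrunning",
            "version": "2.2(1n)"}}},
    ]
}
```

With that sample, all_at_version(sample, "3.0(1k)") is False because the second controller is still on the old release, which is exactly the "keep waiting" case.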

Upgrading the Leaf/Spines
Similar to the APICs, except that you are going to be potentially impacting production traffic if things go bad. Basically under "Firmware Groups" in Admin > Firmware > Fabric Node Firmware > Firmware Groups create a group of AllNodes and select the ACI version you want to go to. 

Before you worry about everything upgrading at the same time... keep in mind that the next step is to create Maintenance Groups, which dictate which switches get upgraded together. This is done under Admin > Firmware > Fabric Node Firmware > Maintenance Groups. Make a primary and a secondary maintenance group of nodes.
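How you split the nodes is up to you. One common approach (my assumption here, not something the guide mandates) is to put one member of each vPC pair in each group and alternate everything else, spines included, so that neither group takes out both sides of anything. A rough sketch with made-up node IDs:

```python
def split_maintenance_groups(vpc_pairs, standalone_nodes=()):
    """Split nodes into (primary, secondary) maintenance groups.

    vpc_pairs: iterable of (node_a, node_b) tuples; one member goes to each group.
    standalone_nodes: other nodes (e.g. spines), alternated between the groups.
    """
    primary, secondary = [], []
    for node_a, node_b in vpc_pairs:
        primary.append(node_a)
        secondary.append(node_b)
    for index, node in enumerate(standalone_nodes):
        (primary if index % 2 == 0 else secondary).append(node)
    return sorted(primary), sorted(secondary)
```

For example, split_maintenance_groups([(101, 102), (103, 104)], [201, 202]) gives ([101, 103, 201], [102, 104, 202]).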

You kick off the upgrade by clicking the "Upgrade Now" action of the primary Maintenance Group... then you wait patiently for things to come back before doing the same for the secondary group.

Based on the link (I've not tested though):

  • Up to 20 nodes are upgraded at the same time
  • Only one member of a vPC pair is ever upgraded at a time (nice!)

Waiting for things to Stabilise
The guide estimates up to 12 minutes for this... be patient. The nodes will come back and things will be good (hopefully). During the upgrade process nodes will reboot, and production traffic should see only minor disruption, provided everything is dual-homed.
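If you're scripting the wait, a generic poll-with-timeout helper does the job better than a fixed sleep. This is a plain Python sketch, not anything ACI-specific: the check function is whatever "is it stable yet?" test you trust (it's hypothetical here), and the timeout should be comfortably longer than the guide's estimate.

```python
import time


def wait_until(check, timeout_s, interval_s=30, sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or timeout_s seconds have passed.

    Returns True if check() succeeded, False if the deadline passed first.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A False result is the cue to stop and go look at the fabric by hand before kicking off the next maintenance group.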

Obviously take this all with a grain of salt... I am not an ACI expert but wanted to write some notes to summarise the wordy Cisco process. Some of my colleagues have screwed this up in the past and managed to get things going again (albeit onsite) using some of the other methods (e.g. the CLI).

Good luck! Hope this helps...