Did you know that with FortiManager you can promote a slave to master in your FortiGate HA cluster with a single GUI button? It's the Promote button, which you can find in the Device Manager >> System Dashboard section. Very convenient. I'll explain how it works and why it can throw an error.

How does it work?

Let's switch to the FortiGate itself for a moment. How do you fail over a cluster? Usually straight from the command line, by executing this command on the master:

diag sys ha reset-uptime
Listing #1

What does it do? To explain that, we have to understand the election process, which evaluates the following criteria in order:

  1. Connected Monitored Ports
  2. Age
  3. Device Priority
  4. Serial Number

Connected Monitored Ports

Regardless of the applied settings and uptime, the unit that has more monitored ports in a connected (up) state will always be selected as the master.

Age

Then the cluster age is taken into consideration. The age is not the uptime. It's the amount of time since a monitored interface last failed or was disconnected. Of course, rebooting a unit resets that timer. So does the reset-uptime command from Listing #1, which is how it forces a failover: the master's age drops to zero and the other unit wins on age. However, there is a 5-minute (configurable) grace period. If the age difference between the units is less than 5 minutes, the election treats it as a tie and the algorithm proceeds to the next step, which is:

Device Priority and Serial Number

For device priority, higher is better. Simple as that. If the priorities are equal, the serial number decides: the unit with the greater S/N becomes the master.
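
The decision order above can be sketched in a few lines of Python. This is only an illustration of the logic described in this article, not FortiOS code: the Unit fields, the serial numbers, and the elect() helper are hypothetical names, serial numbers are assumed to compare as plain strings, and the 300-second margin stands in for the 5-minute grace period (which should correspond to the ha-uptime-diff-margin setting under config system ha).

```python
from dataclasses import dataclass

AGE_MARGIN_S = 300  # assumed: the 5-minute grace period, in seconds


@dataclass
class Unit:
    serial: str        # hypothetical serial number, compared as a string
    monitored_up: int  # number of monitored ports currently up
    age_s: int         # seconds since the last monitored-port failure or reset
    priority: int      # device priority, higher is better


def elect(a: Unit, b: Unit, margin_s: int = AGE_MARGIN_S) -> Unit:
    """Pick the master between two units with override disabled."""
    # 1. More connected monitored ports wins outright.
    if a.monitored_up != b.monitored_up:
        return a if a.monitored_up > b.monitored_up else b
    # 2. Greater age wins, unless the difference is inside the grace period.
    if abs(a.age_s - b.age_s) >= margin_s:
        return a if a.age_s > b.age_s else b
    # 3. Higher device priority wins.
    if a.priority != b.priority:
        return a if a.priority > b.priority else b
    # 4. Greater serial number wins.
    return a if a.serial > b.serial else b


fgt1 = Unit("FGT0001", monitored_up=2, age_s=90_000, priority=200)
fgt2 = Unit("FGT0002", monitored_up=2, age_s=10_000, priority=128)
print(elect(fgt1, fgt2).serial)  # FGT0001 (older age)

fgt1.age_s = 0                   # what reset-uptime does to the master
print(elect(fgt1, fgt2).serial)  # FGT0002 (now wins on age)
```

Note that priority never gets a say in either election here: with the same number of monitored ports up, the age difference alone decides, which is exactly why resetting the age on the master is enough to fail over.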

Override

The override setting is also worth mentioning.

config system ha
  set override enable
end
Listing #2

Once enabled, it changes the decision tree:

  1. Connected Monitored Ports
  2. Device Priority
  3. Age
  4. Serial Number

Device Priority takes precedence over Age. This is useful if, for a particular reason, you want to ensure that one unit is always preferred as the master, for example when the cluster members are in different physical locations and one of them is a backup data center.
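
To see what swapping those two steps means in practice, here is a small Python sketch that runs the same pair of units through both decision trees. It is a simplified model of the rules described in this article, not FortiOS behaviour verified line by line: the dictionary keys, the unit names, and the elect() helper are made up for illustration, and the 300-second margin is the assumed default grace period.

```python
def criteria(override):
    """Return the election criteria in evaluation order."""
    if override:
        return ["monitored_up", "priority", "age_s", "serial"]
    return ["monitored_up", "age_s", "priority", "serial"]


def elect(a, b, override, margin_s=300):
    """Pick the master between two units (dicts) for a given override mode."""
    for key in criteria(override):
        x, y = a[key], b[key]
        # An age difference inside the grace period counts as a tie.
        if key == "age_s" and abs(x - y) < margin_s:
            continue
        if x != y:
            return a if x > y else b
    return a  # identical on every criterion


# Hypothetical units: the preferred one has just recovered from a port failure.
primary_dc = {"name": "primary-dc", "serial": "FGT0001",
              "monitored_up": 2, "age_s": 600, "priority": 200}
backup_dc = {"name": "backup-dc", "serial": "FGT0002",
             "monitored_up": 2, "age_s": 86_400, "priority": 100}

print(elect(primary_dc, backup_dc, override=False)["name"])  # backup-dc
print(elect(primary_dc, backup_dc, override=True)["name"])   # primary-dc
```

With override disabled the long-running backup unit keeps the master role, while with override enabled the higher-priority unit takes it back as soon as it is healthy again.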

Are there any disadvantages? The biggest one is that even if the preferred unit is unstable (faulty power supply, flapping port, etc.), it will be selected as the master every time it comes back, which can cause a lot of unnecessary cluster failovers. So it's crucial to have proper monitoring and alerting on cluster changes in place. It can be configured with:

  • FortiAnalyzer
  • FortiGate's Security Fabric Automations

Promoting master with FortiManager

In the FortiManager GUI, once you open a FortiGate dashboard view, there is a widget that lists all cluster members and lets you promote a backup unit.

Once you click that button, there's a big warning (who's reading that?!) and once you confirm, the units will change roles. Beautiful! However, be careful, as using this button also changes the cluster's HA configuration, which might be dangerous.

Let's assume that you have an A-P cluster with the settings below:

config system ha
    set group-name "fortinet"
    set mode a-p
    set password TOPSECRET
    set hbdev "internal7" 0
    set session-pickup enable
    set ha-mgmt-status enable
    config ha-mgmt-interfaces
        edit 1
            set interface "internal6"
            set gateway 10.10.10.1
        next
    end
    set override disable
    set priority 200
end
Listing #3: FortiGate #1 before PROMOTE

The second unit has a similar configuration, but its priority is set to 128:

config system ha
    set group-name "fortinet"
    set mode a-p
    set password TOPSECRET
    set hbdev "internal7" 0
    set session-pickup enable
    set ha-mgmt-status enable
    config ha-mgmt-interfaces
        edit 1
            set interface "internal6"
            set gateway 10.10.10.1
        next
    end
    set override disable
    set priority 128
end
Listing #4: FortiGate #2 before PROMOTE

You promote FortiGate #2 to master and, later on, promote FortiGate #1 back to master. So in terms of HA roles, we are back in the starting position. Let's display the HA settings on both units one more time:

config system ha
    (...)
    set override enable
    set priority 130
end
Listing #5: FortiGate #1 after 2x PROMOTE

config system ha
    (...)
    set override enable
    set priority 129
end
Listing #6: FortiGate #2 after 2x PROMOTE

For clarity, I omitted the settings that stayed intact. As you can see, two settings are adjusted under the hood:

  • priority values - which shouldn't be a concern
  • override parameter - it is switched from disable to enable; if you are not aware of that, it can be dangerous and can lead to unnecessary failovers

Based on the above test, I don't recommend using that feature for cluster failover. A much better option is to execute the reset-uptime command (Listing #1) directly on the master unit.
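
If you do use the Promote button, it is worth comparing the HA settings before and after. The Python sketch below is one possible way to do that offline, assuming you have saved the config system ha output of a unit as text at both points in time. The parse_ha_settings() and diff_settings() helpers are hypothetical, not part of any Fortinet tool, and the sample text is shortened from Listing #3 and Listing #5.

```python
def parse_ha_settings(text):
    """Collect the top-level 'set <key> <value>' lines of a config block."""
    settings = {}
    depth = 0
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("config "):
            depth += 1
        elif line == "end":
            depth -= 1
        elif depth == 1 and line.startswith("set "):
            parts = line.split(None, 2)
            settings[parts[1]] = parts[2] if len(parts) > 2 else ""
    return settings


def diff_settings(before, after, keys=("override", "priority")):
    """Return {key: (old, new)} for the watched keys that changed."""
    return {k: (before.get(k), after.get(k))
            for k in keys if before.get(k) != after.get(k)}


BEFORE = """config system ha
    set group-name "fortinet"
    set mode a-p
    config ha-mgmt-interfaces
        edit 1
            set interface "internal6"
        next
    end
    set override disable
    set priority 200
end"""

AFTER = """config system ha
    set override enable
    set priority 130
end"""

changes = diff_settings(parse_ha_settings(BEFORE), parse_ha_settings(AFTER))
print(changes)
# {'override': ('disable', 'enable'), 'priority': ('200', '130')}
```

Only override and priority are compared by default, because those are the two settings that changed in the test above. Nested blocks such as ha-mgmt-interfaces are deliberately skipped.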

PROMOTE is failing

Now that you understand what happens behind the Promote button, let's try to figure out why it can fail with an error:

Failed to change the HA device sequence

The cause is not obvious, because apart from that, FortiManager is working fine. You can push settings and policies and retrieve config from the FortiGates, and all status indicators are green. So why is it failing?

As you know, communication between FortiGate and FortiManager is handled by the FGFM protocol, which is well described in the FortiGate / FortiManager - Communications Protocol Guide. The most important part in our scenario is that this communication is based on certificates, so it is independent of user accounts and their passwords. But not for HA manipulation. I verified this with Fortinet Support: providing a correct FortiGate password in the FortiManager settings is critical for this functionality to work. So make sure that you provide proper credentials for a super-admin account.

I hope that this article clarifies the HA election process and demystifies the Promote button in the FortiManager GUI. Happy clustering!