Have you recently upgraded FortiGate and noticed after a week that Memory Utilization is much higher than before? You might be facing a memory leak.
I'll describe the issue using an example process: WAD, which is responsible for Explicit Proxy and WAN Optimization.
Say you have noticed that RAM usage is much higher than expected. How do you find out which process is eating all the memory?
Remember that FortiGate is, broadly speaking, a Unix box, and some troubleshooting commands have been ported to FortiOS. I'll use top-summary, which gives an htop-like view:
FGT# diag sys top-summary
CPU [|||||||||||| ] 30.7%
Mem [|||||||||||||||||||||||||||||||||||| ] 90.3% 1761M/1863M
Processes: 20 (running=2 sleeping=134)

    PID   RSS   CPU%  ^MEM%  FDS     TIME+  NAME
*   153  451M    0.0   24.1  461  53:57.80  wad [x10]
    136  367M    0.0   19.7  442  01:32.16  ipsmonitor [x6]
    156   46M  100.8    2.5   22  19:51.69  updated [x2]
    133   42M    0.8    2.3   25  02:49.33  httpsd [x4]
    155   31M    0.0    1.7   47  15:39.36  scanunitd [x5]
    107   29M    0.0    1.6   14  11:25.40  cmdbsvr
    143   29M    0.0    1.6   25  56:54.89  forticron
  31121   24M    0.0    1.3   12  00:00.86  pyfcgid [x4]
    183   21M    0.0    1.1   26  10:12.94  cw_acd
(...)
In my example, the wad process (10 instances of it, to be specific) is eating around 24% of memory. Depending on the configuration and the appliance's total memory, it can eat as much as 90% of RAM if a memory leak is present.
NOTE: I ignored ipsmonitor because in my configuration that process constantly uses about 20% of memory, and the value was similar before the upgrade.
I prefer top-summary to top because it summarizes all instances of a single process, which gives you a nice overview of how much RAM each feature is using.
Check whether new software is available. Memory leak issues are usually tracked down quickly and fixed in the next minor release. Read the Release Notes! If you are already running the latest available software in the main branch, there are 2 options:
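If you save the top-summary output to a file, picking out the heavy consumers can be scripted. Below is a minimal Python sketch, not an official tool: the column layout is assumed to match the sample above (RSS always shown in megabytes, an optional [xN] instance count after the name), and the function names and the 10% threshold are my own choices for illustration.

```python
import re

# A few rows copied from the sample output above.
SAMPLE = """\
    PID   RSS   CPU%  ^MEM%  FDS     TIME+  NAME
*   153  451M    0.0   24.1  461  53:57.80  wad [x10]
    136  367M    0.0   19.7  442  01:32.16  ipsmonitor [x6]
    156   46M  100.8    2.5   22  19:51.69  updated [x2]
    133   42M    0.8    2.3   25  02:49.33  httpsd [x4]
    107   29M    0.0    1.6   14  11:25.40  cmdbsvr
"""

# Assumed row layout: optional "*" marker, PID, RSS in M, CPU%, MEM%, FDS,
# TIME+, process name, optional "[xN]" instance count.
ROW = re.compile(
    r"^\s*\*?\s*(?P<pid>\d+)\s+(?P<rss>\d+)M\s+(?P<cpu>[\d.]+)\s+"
    r"(?P<mem>[\d.]+)\s+(?P<fds>\d+)\s+(?P<time>\S+)\s+"
    r"(?P<name>\S+)(?:\s+\[x(?P<count>\d+)\])?\s*$"
)


def parse_top_summary(text):
    """Return one dict per process row; header and other lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue
        rows.append({
            "pid": int(m["pid"]),
            "rss_mb": int(m["rss"]),
            "mem_pct": float(m["mem"]),
            "name": m["name"],
            "instances": int(m["count"] or 1),  # no [xN] means one instance
        })
    return rows


def over_threshold(rows, mem_pct=10.0):
    """Processes at or above mem_pct, biggest consumer first."""
    heavy = [r for r in rows if r["mem_pct"] >= mem_pct]
    return sorted(heavy, key=lambda r: r["mem_pct"], reverse=True)


rows = parse_top_summary(SAMPLE)
suspects = over_threshold(rows)
for r in suspects:
    print(f'{r["name"]}: {r["mem_pct"]}% ({r["rss_mb"]}M, {r["instances"]} instance(s))')
```

A process showing up in that list is only a suspect. As with ipsmonitor in my case, compare the value with what you saw before the upgrade before blaming it.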
- downgrade - I try to avoid it, as it's a messy solution
- schedule auto-restart of a process
I'll focus on the second solution. In many cases, you can use it until a new software version is released.
I'll write a simple script that is executed every 12 hours:
config system auto-script
    edit restart_wad
        set interval 43200
        set repeat 356
        set start auto
        set script 'diag test app wad 99'
    next
end
That script will automatically restart the wad process every 12 hours. Simple, but effective. Remember to remove it after upgrading to the version that resolves the bug.
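It is worth checking what the two numbers in the script add up to. The quick Python calculation below assumes that interval is given in seconds and that repeat is the number of times the script runs; verify both against the CLI reference for your FortiOS version.

```python
from datetime import timedelta

# Values taken from the auto-script above.
interval_s = 43200   # assumed unit: seconds
repeat = 356         # assumed meaning: number of executions

period = timedelta(seconds=interval_s)
coverage = period * repeat          # total time span covered by all runs
hours = period.total_seconds() / 3600

print(f"one run every {hours:g} hours")
print(f"roughly {coverage.days} days covered in total")
```

Under those assumptions the script keeps restarting wad for roughly 178 days. If you expect to wait longer than that for a fix, plan for a higher repeat value.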
If the bug is present in the latest available software, make sure that you open a case with Fortinet Support. If they are not aware of the bug, it won't be fixed.
That procedure applies to all other processes. If you can't restart a process using a diag command, try killing it (FortiOS should automatically restart it):
FGT# diagnose sys kill 9 <pid>
- <pid> - replace it with the PID obtained from the top or top-summary output.