It looks like failover together with an IPsec tunnel is not working as it should.
I’m running version 00.07.07.1, but I have tested some other versions and see the same behavior.
To start off, I have several RUT routers in an IPsec tunnel network, each with wired WAN at metric 1, a mobile interface with a higher metric, and failover activated on both interfaces.
All looks good in the status view, with wired WAN as Online and mob1s1a1 as Standby. There is a watcher on both interfaces using ping, but importantly, there is no “Flush connection on” set on the mobile interface.
Still, each drop in communication on mobile, which is on standby, sends a stroke terminate to the tunnel, as below:
daemon.info ipsec: 10[CFG] received stroke: terminate 'myTunnel-mtTunnel_c'
That causes the tunnel to disconnect every time there is a short blip in communication on mobile.
I have checked the mwan3 configuration but can’t find where these terminate commands come from.
It would have been correct if there were a real failover: if my primary interface had gone down, the tunnel should re-initiate. But absolutely not when the standby interface has a problem.
Configuration-wise, I believe I have tried everything in the WebUI and can’t get rid of this behavior, and I don’t want to modify the built-in scripts, since those changes would break when updating. All tips and tricks are welcome; I have been banging my head against this for a while now.
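One read-only way to look for the origin of the terminate is to search the firmware's scripts for calls that tear down an IPsec connection. The sketch below is only an illustration: it assumes the terminate is issued through the strongSwan `ipsec` command (for example `ipsec down <name>`), and the directories to search (such as a copy of `/etc/hotplug.d`) are assumptions about where such scripts might live on an OpenWrt-based system, not confirmed RutOS locations. It is meant to run on a PC against a copy of the router's files, and it changes nothing.

```python
import re
from pathlib import Path

# Assumption: the terminate comes from a script line such as
# `ipsec down <conn>` or a direct `ipsec stroke ...` call.
PATTERN = re.compile(r"\bipsec\b.*\b(down|terminate|stroke)\b")


def find_ipsec_calls(roots):
    """Yield (path, line_number, text) for every script line under the
    given directories that appears to tear down an IPsec connection."""
    for root in roots:
        root = Path(root)
        if not root.is_dir():
            continue  # silently skip directories that do not exist
        for path in sorted(root.rglob("*")):
            if not path.is_file():
                continue
            try:
                text = path.read_text(errors="ignore")
            except OSError:
                continue  # unreadable file, move on
            for number, line in enumerate(text.splitlines(), start=1):
                if PATTERN.search(line):
                    yield str(path), number, line.strip()
```

Calling `list(find_ipsec_calls(["./rootfs/etc/hotplug.d"]))` (a hypothetical path to a local copy) would list candidate lines to inspect by hand; a hit only shows where a call exists, not that it is the one firing on the standby interface.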
The most common issue I have found is the lifetime at phase one and phase two (seconds/minutes/hours). People don’t check this, and then the tunnel does not stay up or stay stable.
The tunnel is 100% stable when not using failover, and the lifetimes on both phases are correctly configured.
All the logs show that it is another service (strongswan?) that sends the terminate; the rows below always appear one right after the other:
daemon.notice netifd: Interface 'mob1s1a1' is now down
daemon.info ipsec: 10[CFG] received stroke: terminate 'myTunnel-mtTunnel_c'
This is also funny: when mobile comes up again, there is another terminate:
user.info mwan3track[15910]: Detect ifup event on interface mob1s1a1 (qmimux1)
daemon.info ipsec: 15[CFG] received stroke: terminate 'myTunnel-mtTunnel_c'
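To show how consistently the terminate follows an interface event, the pairing seen in the excerpts above can be counted over a longer log export. This is a minimal sketch, not a diagnostic tool: the function name and the `window` parameter are made up for illustration, the patterns are taken only from the log lines quoted in this thread, and since those lines carry no timestamps it pairs events by line distance rather than by time.

```python
import re

# Interface events that, in the excerpts above, directly precede a terminate.
TRIGGER = re.compile(
    r"netifd: Interface ['‘’](?P<down>[^'‘’]+)['‘’] is now down"
    r"|mwan3track\[\d+\]: Detect (?P<event>\w+) event on interface (?P<iface>\S+)"
)
# Accept straight or typographic quotes around the connection name.
TERMINATE = re.compile(r"received stroke: terminate ['‘’](?P<conn>[^'‘’]+)['‘’]")


def pair_terminates(lines, window=3):
    """Return (interface, event, connection) for each terminate that
    follows an interface event within `window` lines."""
    pairs = []
    last = None  # (line index, interface, event) of the latest trigger
    for index, line in enumerate(lines):
        trigger = TRIGGER.search(line)
        if trigger:
            if trigger.group("down"):
                last = (index, trigger.group("down"), "down")
            else:
                last = (index, trigger.group("iface"), trigger.group("event"))
            continue
        terminate = TERMINATE.search(line)
        if terminate and last and index - last[0] <= window:
            pairs.append((last[1], last[2], terminate.group("conn")))
            last = None  # each trigger is paired at most once
    return pairs
```

Feeding it the lines of an exported system log (for example `pair_terminates(open("syslog.txt").read().splitlines())`, with a hypothetical file name) gives one tuple per terminate that follows an interface event, which makes it easy to show support that every terminate lines up with a mob1s1a1 event rather than with the wired WAN.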
I appreciate your effort here, but the problem isn’t in the IPsec tunnel. It’s a 100% working and stable tunnel without RutOS failover.
It has DPD, together with a local identifier, to re-initiate quickly if there are drops; this was tested separately from failover and works great. But as soon as I activate failover and the mobile has a poor signal that sometimes drops, the problem starts.
The logs clearly point to misconfiguration or poor handling in the failover routine. I have gone through all the settings I can think of, with no luck, so we are left with poor handling in the failover routine. What I’m looking for is a way to work around this within the standard configuration, to keep the product upgradable for the future personnel who will handle the maintenance phase of this project.
Apologies for any inconvenience caused. Our developers are aware of the issue and are actively working on a fix. The solution is planned to be implemented in the next firmware release, which should be available in the near future.