RUT951+ - Wireguard tunnel not rebuilding after a power loss and reboot of the router

This is a repost of issue #17746, which was accidentally marked as solved although it has not been resolved.

On the RUT951 POE+ I configured Wireguard as a client to connect to the Wireguard server at the office. Everything works fine and the Wireguard service is up … until there’s a power failure on the side of the RUT951. When the power is restored, the router boots back up and reports that Wireguard is up and running, but there is no connection to the server, even though the settings show Wireguard as up. (This remains the case even if I wait 20 minutes or more; the Watchdog interval is set to 5.)

When I then shut down the Wireguard service and save, then restart the service and save again, the connection comes back up after about a minute and the link with the Wireguard server is restored.

The watchdog seems to check whether the service is up and running, but not whether the connection to the Wireguard server has actually been established.

I also tested the updated firmware 7.21.1, but that didn’t resolve the issue.

Is your ‘Endpoint host’ setting an FQDN, e.g. a DDNS endpoint?

For the wireguard_watchdog script to automatically monitor peers, the peer setting MUST have a ‘Persistent keep alive’ set AND have an FQDN as an ‘Endpoint host’. If the ‘Endpoint host’ is an IP address, the watchdog will NOT monitor the peer’s connection and will NOT restart the interface when necessary.
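To make the rule above concrete, here is a minimal Python sketch of it. It only mirrors the behaviour as described in this reply; it is not the actual wireguard_watchdog script, and the function names and parameters are made up for illustration.

```python
import ipaddress
from typing import Optional


def is_ip_literal(host: str) -> bool:
    """Return True if host is an IPv4/IPv6 address rather than a hostname."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def watchdog_monitors_peer(endpoint_host: str,
                           persistent_keepalive: Optional[int]) -> bool:
    """Hypothetical predicate: a peer is monitored only if it has a
    keep-alive value set AND its endpoint host is a name, not an IP."""
    has_keepalive = bool(persistent_keepalive) and persistent_keepalive > 0
    return has_keepalive and bool(endpoint_host) and not is_ip_literal(endpoint_host)
```

With a keep-alive of 20, `watchdog_monitors_peer("203.0.113.10", 20)` returns False while `watchdog_monitors_peer("vpn.example.com", 20)` returns True (both hosts are placeholder values), which matches the situation where an IP endpoint is silently skipped.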

‘Persistent Keep Alive’ is set, currently to ‘20’. But the configured endpoint is an IP address, not an FQDN.

But I would imagine that if the router reboots (for whatever reason), the Wireguard tunnel has to be set up regardless of the Watchdog? :thinking:

I would expect the same. My first port of call would be to reflash the firmware. If no joy, have a look at the logs for any clues.

Some side notes …

When I look at the wireguard watchdog code, there is a max time threshold of ‘2.5 minutes since last handshake’, before the interface is reset. Given this, I assume you’ll want to keep the ‘Watchdog interval’ setting below this time. I use the value 1 (one) for the ‘Watchdog interval’ so I can get at least 2 attempts at a tunnel check before restart. The UI says that if you leave the interval blank (empty) it defaults to the value 1 (one) but in a previous FW version, I found that if I left it blank, it didn’t behave as expected.
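As a rough illustration of the threshold above, the sketch below flags peers whose last handshake is older than 2.5 minutes, based on the output of `wg show <interface> latest-handshakes` (one public key and one Unix timestamp per line, 0 meaning no handshake yet). It is not the router's watchdog code; the function names are hypothetical, and treating the ‘Watchdog interval’ as minutes is an assumption.

```python
import time
from typing import Dict, List, Optional

HANDSHAKE_MAX_AGE_S = 150  # 2.5 minutes, the threshold mentioned above


def parse_latest_handshakes(output: str) -> Dict[str, int]:
    """Map each peer public key to its last-handshake Unix timestamp."""
    handshakes = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            handshakes[parts[0]] = int(parts[1])
    return handshakes


def stale_peers(output: str, now: Optional[float] = None,
                max_age_s: int = HANDSHAKE_MAX_AGE_S) -> List[str]:
    """Return keys with no handshake yet (timestamp 0) or one older than max_age_s."""
    now = time.time() if now is None else now
    return [key for key, ts in parse_latest_handshakes(output).items()
            if ts == 0 or now - ts > max_age_s]


def checks_before_threshold(interval_minutes: float,
                            max_age_s: int = HANDSHAKE_MAX_AGE_S) -> int:
    """Number of whole watchdog runs that fit inside the threshold
    (assumes the interval is expressed in minutes)."""
    return int(max_age_s // (interval_minutes * 60))
```

Under that assumption, `checks_before_threshold(1)` gives 2, in line with the "at least 2 attempts" reasoning, while `checks_before_threshold(5)` gives 0, meaning an interval of 5 is already longer than the threshold.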

And lastly, if your 951+ is always the tunnel initiator, then leave the ‘Listen port’ blank (empty) and on your office server, turn off any keep alive to your 951+.

Greetings,

@Mike thank you for your input.

@C.sec do you require any additional support or has your issue been resolved?

Best Regards,
Justinas

@C.sec : check the routes after a reboot with ip -4 route show

Do you see the one to the wg server? If so, on which interface?

@Justinas no, the issue is still not resolved, as I explained to Matas before.

@vogon I performed a reboot just now, and I do indeed see 10.0.0.21/24 dev wg, so I think the route persists across a reboot. The Wireguard service is still running on firmware RUT9M_R_00.07.16.3. On RUT9M_R_00.07.21.1 it didn’t work.

So please @Justinas put this on the bug list for the next release if possible :grinning_face: . It seems to be the same bug as in this post: Bug Report: WireGuard connection fails after reboot on RUT241 firmware 07.20.1.

This is obviously wrong: a route to a public IP address via the wan / qmimux0 / … interface is missing.
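To check this without reading the table by eye, a small Python sketch like the following can take the text printed by `ip -4 route show` and report which device the most specific matching route uses for the server's address. It is a simplified illustration: it ignores metrics and policy routing, the function names are hypothetical, and the addresses in the usage note are placeholders.

```python
import ipaddress
from typing import List, Optional, Tuple


def parse_routes(output: str) -> List[Tuple[ipaddress.IPv4Network, str]]:
    """Extract (destination network, device) pairs from `ip -4 route show` text."""
    routes = []
    for line in output.splitlines():
        tokens = line.split()
        # Skip empty lines and lines without a "dev <name>" pair.
        if not tokens or "dev" not in tokens[:-1]:
            continue
        dest = "0.0.0.0/0" if tokens[0] == "default" else tokens[0]
        try:
            network = ipaddress.ip_network(dest, strict=False)
        except ValueError:
            continue  # e.g. "unreachable ..." or other special route types
        routes.append((network, tokens[tokens.index("dev") + 1]))
    return routes


def device_for(endpoint_ip: str, output: str) -> Optional[str]:
    """Return the device of the longest-prefix route covering endpoint_ip,
    or None if no route (not even a default route) matches."""
    address = ipaddress.ip_address(endpoint_ip)
    matches = [(net, dev) for net, dev in parse_routes(output) if address in net]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]
```

For a healthy setup, `device_for("203.0.113.10", routes_text)` should name the WAN-side device; a result of None, or the wg interface itself, would point to the missing route described above.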