SNMP daemon crashes on v7.06.3 and v7.06.6

Hi there

I’m having a problem with snmpd crashing, primarily on RUTX09 devices, but I see the same behaviour on RUTX50 devices too.

The daemon answers SNMP queries so slowly that it crashes repeatedly, which leads to timeouts in the polling service (in this case Zabbix 6.4).

I’ve also tried interrogating from the SNMPB program on Windows and see the same behaviour.

Things I’ve tried:

  • Restarting the router
  • Reconfiguring the service
  • Upgrading the firmware from 7.06.3 to 7.06.6
  • Factory resetting to known good config

I know it is a service crash because the logread command in the CLI shows that the system attempted to restart the service 6 times before giving up.

I’m going to try a complete factory reset and re-import of a backup. Other than that I’m out of ideas.

Any input welcome
Thanks

Tino

Bit more info

It actually only crashes under SNMPv3 interrogation. SNMPv1 is slow and does time out, but the service stays up.

From logread |grep snmp

Tue Mar 26 14:19:32 2024 user.notice root: start_service snmpd
Tue Mar 26 14:20:03 2024 daemon.info procd: Instance snmpd::instance1 s in a crash loop 9 crashes, 31 seconds since last crash
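
The crash count in that procd line can be pulled out with a small filter, which makes it easier to compare runs. This is a minimal sketch that assumes the message format shown in the log above; the function names are made up for illustration.

```shell
# Count how many crash-loop reports procd logged for a service.
# Reads log text on stdin; assumes the message format seen in the logread output.
crash_loop_reports() {
    service="$1"
    # grep -c exits non-zero when nothing matches, so keep the exit status clean.
    grep -c "Instance ${service}::.* in a crash loop" || true
}

# Print the crash count from the most recent crash-loop report for a service.
last_crash_count() {
    service="$1"
    sed -n "s/.*Instance ${service}::.* in a crash loop \([0-9][0-9]*\) crashes.*/\1/p" | tail -n 1
}

# Usage on the router:
#   logread | crash_loop_reports snmpd
#   logread | last_crash_count snmpd
```

An empty result from last_crash_count means procd has not reported a crash loop for that service in the current log buffer.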

Anyone?? Please…

More info.

With SNMPv3, the SNMP daemon on the Teltonika seems to respond better with DES than with AES128 encryption. The service doesn’t seem to crash, but it does time out.

I’m using SNMPB as the program to test against various settings.

This behaviour is also present on a RUT955 running v7.05, so maybe it isn’t specific to the 7.06 branch.

Seems like no one is bothered about this bug or has encountered it. Teltonika can contact me if they give a rip!

Hello,

Apologies for a late response.
Could you clarify which MIBs you are polling to replicate this issue? I’ve tested some of them in the MIB Browser, but I am unable to replicate the issue. Could it be specific to certain MIBs, or is it replicable on all of them?
Is Zabbix client also installed on the router? Or is it simply polling the router using SNMP?
Are there any custom scripts/services running on the device that could interfere with SNMP?

Best regards,

Hi Daumantas,

It is when I do an SNMP walk, so it interrogates all available OIDs. There is no Teltonika MIB loaded per se within the SNMPB program, only SNMPv2-MIB I think.

RE: Zabbix, I created a template based on the MIB files for RUT955 and X12 devices. There is no client installed on the router. All information is polled via SNMP.

Hello,

I checked the Linux snmpwalk utility against a RUT955 with the following arguments:

snmpwalk 192.168.55.1 -v 3 -c private -a SHA -A <passphrase> -l authPriv -u <username> -x AES -X <passphrase>

And here are the parameters in the WebUI:

These authentication and privacy algorithms are the strongest supported. Running a full snmpwalk took around 9 seconds:

root@xxxx:~# time snmpwalk 192.168.55.1 -v 3 -c private -a SHA -A <passphrase> -l authPriv -u <username> -x AES -X <passphrase> 
...
iso.3.6.1.2.1.31.1.1.1.19.13 = Timeticks: (0) 0:00:00.00
iso.3.6.1.2.1.31.1.1.1.19.14 = Timeticks: (0) 0:00:00.00
iso.3.6.1.2.1.31.1.5.0 = Timeticks: (0) 0:00:00.00

real    0m9.323s
user    0m0.179s
sys     0m0.127s

With the web interface closed, the time dropped to 4.2s:

real    0m4.182s
user    0m0.162s
sys     0m0.162s

And with SNMPv1 it was even less at 2.8s:

real    0m2.796s
user    0m0.081s
sys     0m0.120s

Since the RUT955 is an embedded device, I would consider this time acceptable, and no snmpd crashes occurred. Additionally, timeout values can usually be configured in the software that uses SNMP.
Could you clarify if these are the times that you are experiencing on your devices?
Perhaps you could try running snmpwalk on a completely fresh installation with only SNMP package installed (without uploading the configuration)?
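
As a sketch of the timeout point: the net-snmp command-line tools accept -t (per-request timeout in seconds) and -r (number of retries). The helper name and the values 10 and 2 below are illustrative assumptions, and the user name and passphrases remain placeholders.

```shell
# Print (not run) an snmpwalk command line with an explicit timeout and retry count.
# -t and -r are standard net-snmp options; everything in <...> is a placeholder.
build_walk_cmd() {
    host="$1"
    timeout_s="${2:-10}"   # seconds to wait for each response (assumed value)
    retries="${3:-2}"      # retransmissions before giving up (assumed value)
    printf 'snmpwalk -v 3 -l authPriv -u %s -a SHA -A %s -x AES -X %s -t %s -r %s %s\n' \
        '<username>' '<passphrase>' '<passphrase>' "$timeout_s" "$retries" "$host"
}

build_walk_cmd 192.168.55.1 10 2
```

Printing the command first makes it easy to check the options before running it with real credentials.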

Best regards,

Is all that on a local LAN Daumantas??

I should have added that the SNMPB walk was done from our HQ over a Wireguard connection to the remote site.

I have confirmed, though, that the Wireguard connection was solid throughout, with a continuous ping running.

I’ll run the SNMP walk command from a Linux machine on the same LAN as the target RUT device and report back.

Correct, these tests were done connected directly to the RUT955. Let me know how the tests go!

Best regards,

Hi Daumantas.

Doing the snmpwalk locally yields quick results. Using the same details over the Wireguard VPN I get repeated timeouts. This is to a RUTX09 running v7.06.6.

SNMPB is the client I am using on Windows to do this remotely. It never showed this timeout behaviour until I noticed it fairly recently (mainly on the 7.06 branch).

Is there any way you can try this over a Wireguard connection in your testing??

Running snmpwalk from the Zabbix server at HQ, reaching the router over Wireguard with the exact same command as used locally, yields a timeout.

The router pings and traceroutes OK,

and now the SNMP daemon on the router has crashed:

root@Teltonika-RUTX09:~# logread |grep snmp
Tue Apr 9 10:37:53 2024 kern.notice kernel: snmpd configuration has been changed
Tue Apr 9 10:37:54 2024 user.notice root: start_service snmpd
Tue Apr 9 10:51:00 2024 daemon.info procd: Instance snmpd::instance1 s in a crash loop 6 crashes, 76 seconds since last crash

If SNMPv1 (and v2?) works fine, this may indicate an issue with the MTU of the Wireguard interface: the client becomes unable to decrypt a fragmented response and times out.
What are the MTUs of the wg and wan/mobile interfaces at both ends?
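
For context on the MTU question, here is a rough sketch of the arithmetic. WireGuard adds 32 bytes of its own header and authentication tag, plus 8 bytes of UDP and the outer IP header (20 bytes for IPv4, 40 for IPv6). The function name is made up, and the 1400-byte link MTU in the last call is a hypothetical value, not one taken from this thread.

```shell
# Largest tunnel MTU that fits in one outer packet without fragmentation.
# Arguments: MTU of the underlying link, outer IP family (4 or 6, default 6).
wg_mtu() {
    link_mtu="$1"
    family="${2:-6}"
    if [ "$family" = "4" ]; then ip_hdr=20; else ip_hdr=40; fi
    # outer IP header + 8 bytes UDP + 32 bytes WireGuard overhead
    echo $(( link_mtu - ip_hdr - 8 - 32 ))
}

wg_mtu 1500 6   # prints 1420, the usual default
wg_mtu 1500 4   # prints 1440
wg_mtu 1400 4   # prints 1340 (hypothetical link with a 1400-byte MTU)
```

If the WAN or mobile link underneath has an MTU below 1500, a tunnel MTU of 1420 can be too large, and large SNMP responses would then depend on fragmentation working along the whole path.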

SNMPv1 and v2 seem to time out too.

MTU is 1420… Defaults, nothing fancy with jumbo frames

Hello,

Could you try lowering it to 1350 and check if the issue still persists?
Or perhaps a different VPN (e.g. IPsec) could also be tested?
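
A minimal sketch of how the lower MTU could be tried and checked from a Linux host. The interface name wg0 and the peer address 10.0.0.1 are assumptions, so substitute your own. The commands are printed rather than executed so they can be reviewed first. `ping -M do` (don't fragment) is an iputils option and may not be available in the router's own ping.

```shell
# ICMP echo payload size that yields an IPv4 packet of exactly the given MTU
# (20-byte IPv4 header + 8-byte ICMP header).
ping_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Assumed names: override via the environment to match your setup.
WG_IF="${WG_IF:-wg0}"        # tunnel interface (assumption)
PEER="${PEER:-10.0.0.1}"     # address of the far end inside the tunnel (assumption)

# Temporary change; it is lost on reboot or when the interface is re-created.
echo "ip link set dev $WG_IF mtu 1350"

# Probe with the don't-fragment bit set: if this fails while smaller sizes work,
# the path cannot carry 1350-byte packets.
echo "ping -c 3 -M do -s $(ping_payload_for_mtu 1350) $PEER"
```

A change made with ip link is not persistent, so a value that helps would still need to be set in the router's WireGuard configuration.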

Best regards,