Sodog enters unrecoverable CPU loop when modem USB reset occurs (TRB245, RutOS 00.07.14.5)

Device: TRB245
Firmware: RutOS 00.07.14.5
Service: RS Over IP (sodog daemon)


Description:

When the modem (Quectel EC25) undergoes a USB reset, a normal event in LTE operation, the sodog process managing a serial-over-IP profile enters an unrecoverable busy loop and consumes 46% CPU indefinitely. The affected serial port stops responding and does not recover without manual intervention.

Root cause chain:

The modem (ttyUSB2–5, USB 1-1.4) and the RS232/RS485 adapters (ttyUSB0/1, USB 1-1.2/1-1.3) share the same USB hub. When gsmd fails to communicate with the modem (errno 5 / EIO) and triggers a modem USB reset, the resulting bus disturbance corrupts the serial adapter’s I/O. sodog then enters a tight error-read loop on the affected ttyUSB device and never recovers.
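
The shared-hub topology described above can be checked on the device through sysfs. The following is a minimal sketch, assuming the standard Linux layout in which /sys/class/tty/ttyUSBn resolves to a path containing the USB port (for example 1-1.2). The function names usb_port_from_sysfs and list_ttyusb_ports are illustrative and not part of RutOS.

```shell
#!/bin/sh
# Illustrative helper names; not part of RutOS.

# Read resolved sysfs paths on stdin and print the deepest USB port
# component (e.g. "1-1.2"). Interface entries such as "1-1.2:1.0"
# contain a colon and are therefore not matched.
usb_port_from_sysfs() {
    awk -F/ '{
        port = ""
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9]+-[0-9]+(\.[0-9]+)*$/) port = $i
        if (port != "") print port
    }'
}

# Print "ttyUSBn <usb-port>" for every USB serial device present.
list_ttyusb_ports() {
    for t in /sys/class/tty/ttyUSB*; do
        [ -e "$t" ] || continue
        printf '%s %s\n' "${t##*/}" "$(readlink -f "$t" | usb_port_from_sysfs)"
    done
}
```

Calling list_ttyusb_ports on the router should show the adapters and the modem ports with a common port prefix if they sit behind the same hub.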

Observed symptoms:

  • sodog process at 46% CPU, sustained indefinitely
  • logread consuming 38% CPU due to log flooding from sodog
  • Load average: 2.61 (normal: <0.3)
  • TCP receive buffer on affected port: 51 bytes stuck, no data forwarded
  • No active TCP connections possible on affected serial port
  • Field devices alarming due to loss of serial communication

Relevant log entries:
daemon.info gsmd: [CMM] Write to modem failed: errno 5
daemon.info gsmd: [MODEM_MANAGER] Modem already exists and is registered!
daemon.info gsmd: [MODEM_MANAGER] Unable to initialize modem!
kernel: [01-serial-symlink.sh] New device ttyUSB2 appeared!

Recovery:
/etc/init.d/rs_overip restart
CPU usage returned to normal (<10%) immediately after the restart.

Expected behaviour:
sodog should handle serial I/O errors gracefully, either by restarting the affected profile automatically or by entering a defined error state without monopolising the CPU.
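
The pattern requested here, bounded retries with a growing delay instead of a tight loop, can be sketched as follows. This is only an illustration in shell: sodog's internals are not known to the reporter, and retry_with_backoff is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical sketch of bounded retry with exponential backoff.
# Usage: retry_with_backoff MAX_TRIES BASE_DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 after MAX_TRIES failures.
retry_with_backoff() {
    max=$1
    delay=$2
    shift 2
    n=1
    while :; do
        "$@" && return 0                # success: stop retrying
        [ "$n" -ge "$max" ] && return 1 # give up: caller enters error state
        sleep "$delay"                  # yield the CPU between attempts
        delay=$((delay * 2))            # double the wait each time
        n=$((n + 1))
    done
}
```

For example, retry_with_backoff 5 1 read_port || mark_profile_failed would try a (hypothetical) read_port five times over roughly 15 seconds, then hand over to a (hypothetical) error handler instead of spinning.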

Workaround applied:
Cron-based watchdog checking sodog CPU usage every 5 minutes, restarting rs_overip if any sodog process exceeds 50% CPU.
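
A minimal sketch of such a watchdog is given below. It assumes BusyBox-style top -b -n 1 output, in which %VSZ and %CPU are the first two fields ending in "%" on each process line. The function name sodog_pids_over, the THRESHOLD variable, the "run" argument and the logger tag are illustrative choices. The check uses a single snapshot rather than a sustained average. Since the loop was observed at 46% CPU, which is below the 50% limit, the threshold may need lowering.

```shell
#!/bin/sh
# Hypothetical watchdog sketch; names and layout assumptions are noted inline.

THRESHOLD="${THRESHOLD:-50}"   # percent CPU; 50 is taken from the workaround

# Read top output on stdin and print the PID of every process line that
# mentions sodog and whose %CPU exceeds the limit given as $1.
# Assumption: the second field ending in "%" is %CPU (the first is %VSZ).
sodog_pids_over() {
    awk -v limit="$1" '
        /sodog/ {
            seen = 0
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^[0-9.]+%$/) {
                    seen++
                    if (seen == 2) {
                        cpu = $i
                        sub(/%$/, "", cpu)
                        if (cpu + 0 > limit + 0) print $1
                        break
                    }
                }
            }
        }'
}

main() {
    pids=$(top -b -n 1 | sodog_pids_over "$THRESHOLD")
    if [ -n "$pids" ]; then
        logger -t sodog_watchdog \
            "sodog above ${THRESHOLD}% CPU (PIDs: $(echo $pids)), restarting rs_overip"
        /etc/init.d/rs_overip restart
    fi
}

# Only act when invoked with "run", so the file can be sourced safely.
if [ "${1:-}" = "run" ]; then
    main
fi
```

A crontab entry along the lines of */5 * * * * /root/sodog_watchdog.sh run (path hypothetical) would give the 5-minute interval described above.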