This is a follow-up to the same issue from the old forum. Same backtrace:
root@lgr5g:/etc/config# gdb netifd /tmp/netifd.1691111243.1984.11.core
GNU gdb (GDB) 10.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "arm-openwrt-linux".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from netifd...
(No debugging symbols found in netifd)
[New LWP 1984]
Core was generated by `/sbin/netifd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00021680 in ?? ()
(gdb) bt
#0 0x00021680 in ?? ()
#1 0x0002934c in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)
I’ll compile netifd with debug information to see what is going on:
make package/network/config/netifd/compile V=sc TARGET_OPTIMIZATION="-ggdb3 -O0" STRIP="/bin/true"
Now the backtrace is a little more helpful:
Core was generated by `/sbin/netifd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00021680 in wireless_device_hotplug_event (add=65, name=0x34322e31 <error: Cannot access memory at address 0x34322e31>)
at /home/fl/rutos-ipq40xx-rutx-gpl/build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/netifd-2022-01-12-5ca5e0b4/wireless.c:1571
1571 /home/fl/rutos-ipq40xx-rutx-gpl/build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/netifd-2022-01-12-5ca5e0b4/wireless.c: No such file or directory.
So wireless_device_hotplug_event(const char *name, …) is called with an invalid argument: here name = 0x34322e31, which gdb cannot read as an address.
The same is true for its caller, device_hotplug_event(const char *name, …).
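One thing worth checking about a bogus pointer like 0x34322e31 is whether its bytes are printable ASCII, since that usually means string data was written over the pointer. The sketch below is not part of netifd: decode_le32_ascii() is a hypothetical helper name, and it assumes a little-endian target, which should be the case for this arm-openwrt-linux build.

```c
#include <ctype.h>
#include <stdint.h>

/*
 * Decode a 32-bit value as the four bytes a little-endian CPU would
 * have stored in memory, lowest byte first. Non-printable bytes are
 * replaced by '?'. out must have room for 5 bytes (4 chars + NUL).
 * Returns 1 if all four bytes are printable ASCII, 0 otherwise.
 */
int decode_le32_ascii(uint32_t value, char out[5])
{
    int all_printable = 1;

    for (int i = 0; i < 4; i++) {
        unsigned char byte = (value >> (8 * i)) & 0xff;

        if (isprint(byte)) {
            out[i] = (char)byte;
        } else {
            out[i] = '?';
            all_printable = 0;
        }
    }
    out[4] = '\0';
    return all_printable;
}
```

Calling decode_le32_ascii(0x34322e31, buf) fills buf with "1.24": the bytes are 0x31 '1', 0x2e '.', 0x32 '2', 0x34 '4'. That looks like a fragment of a dotted string such as an IP address or a version number, although which string it came from is only a guess at this point. The other argument, add=65, is also the ASCII code for 'A'.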
As the stack appears to be corrupt, it isn’t easy to go further, other than trying to infer the name of a plausible caller.
netifd_handle_dev_hotplug() seems to be a good candidate: it has one local variable named tb, which is an array, and writing to it without enough precautions can corrupt the memory around it.
Next step: add traces there.
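One possible way to trace this is to sandwich the suspect array between two guard words and check them after the code that fills it. This is a minimal sketch under assumptions, not netifd code: struct guarded_tb, guarded_tb_init(), guarded_tb_check(), GUARD_MAGIC and N_ATTRS are all hypothetical names, the array size is a placeholder, and it assumes stderr is visible (otherwise the fprintf would need to go to syslog).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define GUARD_MAGIC 0xdeadbeefu
#define N_ATTRS 2 /* placeholder: use the real number of entries in tb */

/* The array under suspicion, sandwiched between two guard words. */
struct guarded_tb {
    uint32_t guard_before;
    void *tb[N_ATTRS];
    uint32_t guard_after;
};

/* Set both guards and clear the array before it is filled. */
void guarded_tb_init(struct guarded_tb *g)
{
    g->guard_before = GUARD_MAGIC;
    for (int i = 0; i < N_ATTRS; i++)
        g->tb[i] = NULL;
    g->guard_after = GUARD_MAGIC;
}

/*
 * Returns 1 if both guards still hold GUARD_MAGIC.
 * Otherwise logs the clobbered values to stderr and returns 0.
 */
int guarded_tb_check(const struct guarded_tb *g, const char *where)
{
    int ok = g->guard_before == GUARD_MAGIC &&
             g->guard_after == GUARD_MAGIC;

    if (!ok)
        fprintf(stderr, "%s: tb guard clobbered (before=0x%08x after=0x%08x)\n",
                where, (unsigned)g->guard_before, (unsigned)g->guard_after);
    return ok;
}
```

The idea would be to call guarded_tb_init() before the array is filled and guarded_tb_check() right after, passing the function name as where. This only catches writes that land on a guard word, so a clean result does not rule out corruption coming from somewhere else.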
Next coredump for netifd compiled with debug information:
(gdb) bt full
#0 device_release (dep=0xa21c78)
at /home/fl/rutos-ipq40xx-rutx-gpl/build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/netifd-2022-01-12-5ca5e0b4/device.c:560
dev = 0x34322e31
__func__ = <error reading variable>
#1 0x00029498 in interface_flush_state (iface=0xa21bb0)
at /home/fl/rutos-ipq40xx-rutx-gpl/build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/netifd-2022-01-12-5ca5e0b4/interface.c:281
No locals.
#2 0x0002d394 in interface_main_dev_cb (dep=0xa21c30, ev=<optimized out>)
at /home/fl/rutos-ipq40xx-rutx-gpl/build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/netifd-2022-01-12-5ca5e0b4/interface.c:437
iface = 0xa21bb0
#3 0x0001f8b8 in device_broadcast_cb (ctx=<optimized out>, list=<optimized out>)
at /home/fl/rutos-ipq40xx-rutx-gpl/build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/netifd-2022-01-12-5ca5e0b4/device.c:485
dep = <optimized out>
ev = <optimized out>
__mptr = <optimized out>
__mptr = <optimized out>
#4 0xb6eb33fc in ?? ()
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)