[Dibbler-devel] Dibbler client crashes if address is duplicate

Mircea Ciocan mirceac at gmail.com
Mon Jul 25 13:31:49 CEST 2016


Hello Tomasz, I was able to reproduce the problem reliably. It is indeed
caused by the duplicate DUID. After fighting a bit with Ubuntu to get
useful core dumps (by default it is configured in the most
developer-unfriendly way), here is what I've got:

gdb /usr/local/sbin/dibbler-client -c /etc/dibbler/eth3/core
GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /usr/local/sbin/dibbler-client...done.
[New LWP 16067]
[New LWP 16068]
[New LWP 16069]

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffdaa331000
Core was generated by `/usr/local/sbin/dibbler-client run -w /etc/dibbler/eth3'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x0000000000492db2 in TMsg::getSize (this=0x20f6f20) at Msg.cpp:71
#2  0x000000000042a92e in TClntMsg::send (this=0x20f6f20) at ClntMsg.cpp:349
#3  0x0000000000439f42 in TClntMsgRequest::TClntMsgRequest (this=0x20f6f20, IAs=..., srvDUID=..., iface=23) at ClntMsgRequest.cpp:171
#4  0x0000000000412ce9 in TClntTransMgr::checkDecline (this=0x20f5af0) at ClntTransMgr.cpp:1223
#5  0x000000000040dd63 in TClntTransMgr::doDuties (this=0x20f5af0) at ClntTransMgr.cpp:479
#6  0x000000000040716d in TDHCPClient::run (this=0x7ffdaa325880) at ./Misc/DHCPClient.cpp:168
#7  0x00000000004061b0 in run () at ./Port-linux/dibbler-client.cpp:116
#8  0x000000000040667f in main (argc=4, argv=0x7ffdaa325ac8) at ./Port-linux/dibbler-client.cpp:182
(gdb)
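
A note on reading this: frame #0 at 0x0 means execution jumped to a null address, and the jump came out of TMsg::getSize. I have not yet confirmed what exactly was called there. Assuming (and this is only an assumption) that getSize walks the message's options and one slot is empty, a defensive version could look like the minimal sketch below. Option, FixedOption and messageSize are hypothetical names for illustration, not dibbler's classes, and a guard like this would not help if the option object is stale or corrupted rather than simply missing.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a DHCPv6 option; NOT dibbler's TOpt.
struct Option {
    virtual ~Option() = default;
    virtual std::size_t size() const = 0;
};

// Hypothetical option with a fixed encoded size, used only for illustration.
struct FixedOption : Option {
    explicit FixedOption(std::size_t n) : n_(n) {}
    std::size_t size() const override { return n_; }
    std::size_t n_;
};

// Sums the encoded size of a message: the 4-byte DHCPv6 header
// (1 byte message type + 3 bytes transaction id) plus every option.
// Empty slots are skipped instead of being called through.
std::size_t messageSize(const std::vector<std::shared_ptr<Option>>& options) {
    std::size_t total = 4;
    for (const auto& opt : options) {
        if (!opt) {
            continue;  // empty slot: nothing to add, and nothing safe to call
        }
        total += opt->size();
    }
    return total;
}
```

With a list holding a 10-byte option, an empty slot and a 6-byte option, messageSize returns 20 instead of crashing on the empty slot.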

The last client log messages:
2016.07.25 12:10:34 Client Notice    Network switch off event detected. initiating CONFIRM.
2016.07.25 12:10:34 Client Debug     Failed to read sockets (select() returned -1), error=Interrupted system call
2016.07.25 12:10:34 Client Notice    Network switch off event detected. do Confirmming.
2016.07.25 12:10:34 Client Info      Creating CONFIRM: 1 IA(s) on eth3/23
2016.07.25 12:10:34 Client Debug     Sending CONFIRM(opts:1 3 39 8 6 ) on eth3/23 to multicast.
2016.07.25 12:10:34 Client Debug     Sleeping for 1 second(s).
2016.07.25 12:10:34 Client Debug     Received 70 bytes on interface eth3/23 (socket=6, addr=fe80::3e97:eff:fe86:9a7d).
2016.07.25 12:10:34 Client Info      Received REPLY on eth3/23,trans-id=0x558bc3, 3 opts: 1 2 13
2016.07.25 12:10:34 Client Notice    Address 2001:1:8::6/128 added to eth3/23 interface.
RTNETLINK answers: File exists
2016.07.25 12:10:34 Client Debug     Not executing external script (Notify script disabled).
2016.07.25 12:10:34 Client Debug     Sending DECLINE(opts:1 2 3 8 ) on eth3/23 to multicast.
2016.07.25 12:10:34 Client Info      Sending DECLINE for IA(IAID=1) 2001:1:8::6
2016.07.25 12:10:34 Client Notice    Address 2001:1:8::6/128 deleted from eth3/23 interface.
[client log stops here]
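
The DECLINE is presumably the client's reaction to the address being flagged as a duplicate. For anyone who wants to check the duplicate address detection state of an address by hand on Linux, here is a minimal sketch that parses one line of /proc/net/if_inet6. To be clear, this is not how dibbler does it (per Tomek's reply quoted below, that is is_addr_tentative in Port-linux/lowlevel-linux.c); AddrState and parseIfInet6Line are names I made up for illustration. The flag values are the ones from <linux/if_addr.h>.

```cpp
#include <sstream>
#include <string>

// Address flag values as defined in <linux/if_addr.h>.
constexpr unsigned kDadFailed = 0x08;  // IFA_F_DADFAILED
constexpr unsigned kTentative = 0x40;  // IFA_F_TENTATIVE

// Hypothetical result type, for illustration only.
struct AddrState {
    bool parsed = false;     // true if the line had the expected layout
    bool tentative = false;  // duplicate address detection still pending
    bool dadFailed = false;  // duplicate address detection failed
    unsigned ifindex = 0;
    std::string ifname;
};

// Parses one line of /proc/net/if_inet6, whose layout is:
//   <32 hex digits of address> <ifindex> <prefix len> <scope> <flags> <ifname>
// All numeric fields are hexadecimal.
AddrState parseIfInet6Line(const std::string& line) {
    AddrState st;
    std::istringstream in(line);
    std::string addr;
    std::string name;
    unsigned ifindex = 0, prefixLen = 0, scope = 0, flags = 0;
    if (!(in >> addr >> std::hex >> ifindex >> prefixLen >> scope >> flags >> name)) {
        return st;  // malformed line: parsed stays false
    }
    if (addr.size() != 32) {
        return st;  // not a full IPv6 address in hex
    }
    st.parsed = true;
    st.tentative = (flags & kTentative) != 0;
    st.dadFailed = (flags & kDadFailed) != 0;
    st.ifindex = ifindex;
    st.ifname = name;
    return st;
}
```

As a made-up example, the line "20010001000800000000000000000006 17 80 00 48 eth3" would parse as 2001:1:8::6/128 on eth3 (ifindex 23) with both the tentative and the DAD-failed flags set.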


On the ISC server side I got:
Confirm message from <local link addr of the client that crashes>
Sending Reply to <local link addr of the client that crashes>
Decline message from <local link addr of the client that crashes>
Client <DUID duplicate> reports address 2001:1:8::6/128 is in use by another host!
Sending reply to <local link addr of the client that crashes>
[after this no other messages are logged]


If I can run any tests to help narrow down the search, I will gladly do so.
I will now start looking into the code; maybe I'll find something.

Best regards and waiting to hear from you, M.C.


On Fri, Jul 22, 2016 at 4:29 PM, Tomek Mrugalski <thomson at klub.com.pl>
wrote:

> On 22/07/16 11:45, Mircea Ciocan wrote:
> > Hello all, I have deployed the dibbler client in a moderately large
> > network using ISC DHCP server version 4.1.1 and for some weeks all
> > worked OK, but now I have the following problem:
> >
> > In some specific situations the dibbler-client dies without a trace, and
> > the only messages I have in the DHCP server log:
> >
> > Jul 21 09:59:24 dhcpd: Client XXXX releases address YYYY
> > Jul 21 10:00:20 dhcpd: Client XXXX reports address YYYY is already in
> > use by another host!
> > Jul 21 10:32:39 dhcpd: Client XXXX reports address YYYY is already in
> > use by another host!
> >
> > Afterwards, the dibbler-client suddenly dies, and the device is left
> > without any valid IPv6 address, and thus becomes unreachable :(.
> What do you mean by "dies"? Segfaults or terminates silently?
>
> This should be relatively easy to reproduce: configure your server with
> a very small pool of one address and then configure manually this
> address on the server's interface. The client will get this address and
> should trigger the DAD routine.
>
> > As far as I was able to find out, this could be a situation similar
> > to this one:
> >
> > http://comments.gmane.org/gmane.network.dhcp.isc.dhcp-client/8957
> >
> > Having duplicate DUIDs in a network is a bad thing, but it sometimes
> > happens, and the client service should not crash but reject the
> > address. This seems to be a bug.
> True. Out of curiosity, what was the reason for duplicate DUIDs to
> appear? Cloned VM, cheap NICs that happen to have the same address or
> something else?
>
> > Tomasz or anybody that knows the code base organisation, could you
> > kindly point me to the C++ module that deals with this condition to try
> > to see what is happening and what leads to the crash ?
> The actual determination of whether an address is duplicated is done in the
> is_addr_tentative function in Port-linux/lowlevel-linux.c. You can grep
> *.cpp sources to find where this function is called from. The areas you
> should look at are:
>
> AddrMgr/AddrIA.cpp:447
> ClntAddrMgr/ClntAddrMgr.cpp:371
> ClntCfgMgr/ClntCfgMgr.cpp:230 and 842
>
> If dibbler-client segfaults, you can set ulimit -c 9999999999 and then
> dibbler will dump the core, which can be inspected with gdb. If that is
> the case, please send me the backtrace of it (bt command in gdb).
>
> If dibbler quits silently, there may be another cause. Dibbler uses
> imported code from iproute tools that I don't fully understand. There's
> a lot of exit() calls in it. One possibility is that those pieces of the
> code are somehow triggered.
>
> Good luck,
> Tomek