All because I dropped a box in the garage.



  •  I suppose that's a reasonable title for what happened to my home server.

     This server is a Linux box hosting, amongst many other things, a little server daemon that takes the output of a marine VHF AIS data receiver arriving on a serial port and relays it to as many telnet clients as wish to connect.

    It sits on port 8080 on the firewall. If you connect to it then it simply spits out a sequence of timestamped NMEA strings as they are picked up from the ships in the harbour near my house.
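
    The relay step described above can be sketched roughly as follows. This is only an illustration, not the daemon's actual code: the function names `timestamped` and `broadcast` are hypothetical, and the ISO-style UTC timestamp format is an assumption, since the post does not say what format the daemon uses.

```python
import socket
import time


def timestamped(sentence: str, now: float) -> bytes:
    """Prefix one NMEA sentence with a UTC timestamp.

    The timestamp layout is an assumption made for this sketch.
    """
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now))
    return f"{stamp} {sentence.strip()}\r\n".encode("ascii", errors="replace")


def broadcast(line: bytes, clients: list) -> list:
    """Send one line to every connected client.

    Returns the clients that accepted the data. A client whose socket
    raises an error is closed and dropped from the returned list.
    Note that sendall() can block on a slow client; a real daemon would
    need a timeout or a per-client thread to avoid stalling the others.
    """
    alive = []
    for sock in clients:
        try:
            sock.sendall(line)
            alive.append(sock)
        except OSError:
            sock.close()
    return alive
```

    Each sentence read from the serial port would be passed through `timestamped` and then handed to `broadcast` along with the current client list.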

    I forward this data to an AIS service (AISHub) which then uses the data to provide shipping maps.

    The daemon occasionally goes around checking for zombie TCP/IP connections and removes any server threads dealing with these.
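
    One common way to do this kind of zombie check is sketched below. It is an assumption about how such a check might work, not the daemon's real implementation, and the names `peer_has_closed` and `reap` are made up for the example. It polls each socket without blocking and peeks at the receive buffer: an empty read means the other end has closed.

```python
import select
import socket


def peer_has_closed(sock: socket.socket) -> bool:
    """Return True if the remote end has closed or reset the connection.

    Limitation: if the peer sent data that was never read and then closed,
    the peek sees that data first and the close goes unnoticed until the
    buffer is drained.
    """
    try:
        readable, _, _ = select.select([sock], [], [], 0)
        if not readable:
            # Nothing pending: the connection is alive, or merely idle.
            return False
        # MSG_PEEK leaves any real data in the buffer; b"" means orderly close.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except (OSError, ValueError):
        # Connection reset, or the socket was already closed locally.
        return True


def reap(connections: list) -> list:
    """Close every dead connection and return the ones still open."""
    survivors = []
    for sock in connections:
        if peer_has_closed(sock):
            sock.close()
        else:
            survivors.append(sock)
    return survivors
```

    A check like this only finds connections the client has actually dropped. A client that keeps its old connections open while opening new ones passes it every time.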

    This had accommodated the behaviour of the client at AISHub, which would connect and then disconnect if it got no data after a few seconds.

    When the power fails on the AIS receiver, the no-data disconnect-reconnect sequence ticks along nicely.

     

    AISHub occasionally change their client software without telling anybody... and then this happened.

    The other day I dropped a box on the power cable to the AIS receiver. It turned off. As it is in the garage, I didn't notice the LED on the front had gone off.

    Two days later the Linux server went strange: it served up blank web pages and failed to accept any files copied to it.

    Checking the LVM partitions showed the system partition was full. Checking the logs showed the system and user logs were filling the disk partition - 5 gigabytes of error messages....

    These showed the external AISHub client was now reconnecting every few seconds without dropping the previous connection when it failed to receive data in time.

    So the daemon couldn't kill the old connections because they were still live.

    Eventually the system ran out of file handles and following that the system log was gigabytes of 'no more file handles' errors. 

    Deleting the logs, reconnecting the power to the AIS receiver and restarting the server seems to have made the system stable.

    Now I think I need to insert a policy to limit the number of simultaneous connections from any IP address to some small number nearer 1 than 10000.
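
    A minimal sketch of such a policy, assuming a threaded server written in Python (the post does not say what language the daemon uses). The class name `PerIPLimiter` and the default of 2 connections per address are illustrative choices, not anything taken from the real daemon.

```python
import threading


class PerIPLimiter:
    """Count live connections per client IP and refuse any beyond a cap."""

    def __init__(self, max_per_ip: int = 2):
        self.max_per_ip = max_per_ip
        self._counts = {}               # ip -> number of live connections
        self._lock = threading.Lock()   # server threads share this object

    def try_acquire(self, ip: str) -> bool:
        """Register a new connection; return False if the IP is at its cap."""
        with self._lock:
            current = self._counts.get(ip, 0)
            if current >= self.max_per_ip:
                return False
            self._counts[ip] = current + 1
            return True

    def release(self, ip: str) -> None:
        """Unregister a connection when its handler thread exits."""
        with self._lock:
            remaining = self._counts.get(ip, 0) - 1
            if remaining > 0:
                self._counts[ip] = remaining
            else:
                # Drop the entry so the table does not grow without bound.
                self._counts.pop(ip, None)
```

    In the accept loop, the server would call `try_acquire` with the peer address right after `accept()` and close the new socket immediately if it returns False. Each handler thread would call `release` in a `finally` block so the count stays correct even when the handler fails.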

     

     



  •  Sounds like a bug in the AISHub software. I cannot see a reason why they would want to keep all the connections.



  • @mikedjames said:

    reasonable title

    Was the title randomly generated? Should I link to xkcd?



  • @serguey123 said:

    @mikedjames said:

    reasonable title

    Was the title randomly generated? Should I link to xkcd?

    Only if you want a Rosie onslaught...



  •  Not quite random. But sleep deprived.

    Had spent most of a night evolving strategies in my head for getting my 1974 VW camper van fixed in time for a drive to a VW show next Saturday.

    Answer after hours of thought =  take it to the local VW expert.

