What's Your Duplex?



  • This summer I took a position with my local university as a web developer.  There were plenty of WTFs associated with the website already, but the best came from the head network admin.

    We had finished migrating our servers from our department to that of the main university IT, and my access to the web server now consisted of two Windows shares--the web directory on the test and production servers.  I had also managed (with a lot of convincing) to get them to set up CVS, since our hack-job Perl script for pushing changes forward was far from ideal.

    The problem was, CVS was horrendously slow.  With 2GB of data in 1000 files, doing an update (without any changes, mind you) took over half an hour.  A simple 'find > /dev/null' took 20 minutes.  A full checkout onto one server took nearly 8 hours!

    After a list of WTFs trying to figure it out with the IT department, I get to talk to the network admin, who says, "Have you checked your duplex setting?"  Now, I haven't worked with many large networks, but I know enough from small networks that any modern router/switch/hub handles duplex automatically; surely this couldn't be the problem.  I tried to argue, but he refused to try something else, saying, "Whenever we have a problem with the network, that's usually the problem."

    Fine, I check and my computer is set to Auto, as it should be.  But why is it currently running at half-duplex?  I inform the network admin and he says, "We have that wall plate set to half-duplex.  Set your computer to full and we'll set the wallplate to full."  Of course, I say, "Why not just let it select it automatically?" to which he replies, "It's not that simple."  WTF?

    Perhaps I missed something (that he refused to explain to me, as another IT person), but since when is it "not that easy"?

    I later found out from our own department that, because of certain hardware causing problems, the wallplates all over campus had their duplex set statically several years ago.  Since then, they were all supposed to have been switched back to auto, but some were missed.  Of course, he wasn't setting this one to auto either, just keeping with the WTF trend.

    Any network guys want to back me up here?  Please tell me there isn't some obscure reason for manually setting the duplex. 



  • [quote user="RevEng"]

    Any network guys want to back me up here?  Please tell me there isn't some obscure reason for manually setting the duplex. 

    [/quote]

    Perhaps the auto setting doesn't work properly. That's the only halfway sensible reason I can see... and it's still only halfway sensible.

     



  • Alas, auto negotiation doesn't always work perfectly, as your network admin seems to have experienced several times before. Still, it should obviously be the first thing to try, especially when connecting a new host.

     



  • First thing we do with our new switches is disable unused ports and lock the rest to 1000, 100 or 10 at either full or half duplex, depending on what is plugged in. While auto negotiation has gotten a lot better, it is not perfect. JetDirect cards are the first that come to mind: they need to be 10 half or you get errors.



  • [quote user="RevEng"]

    Any network guys want to back me up here?  Please tell me there isn't some obscure reason for manually setting the duplex. 

    [/quote]

    Speed/duplex negotiation doesn't always work right.  These days, the problem is usually that two 100Mb full-duplex cards negotiate a connection of 10Mb half-duplex, but historically, the most common problem was that one card was only capable of 10Mb half-duplex and the other didn't negotiate speed.



  • I fail to see how this will have a huge impact on CVS performance. I'm sure it'll help somewhat, but if setting the NIC to full duplex changes the checkout time from over half an hour to just minutes, then there's something else wrong. The setting doesn't increase bandwidth in either direction; it simply lets input and output operate at the same time. If input plus output took 30 minutes for a checkout at half duplex, then changing to full duplex will at best reduce the total to the time of just one direction, assuming neither stream is waiting on a trigger from the other. I'm thinking it'll now take 15 minutes if you are lucky. If it takes less than that, then there was another issue.
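
    A rough sketch of that arithmetic, assuming an idealized link with no collisions, retransmits or protocol waits. The function name and the even 15/15 split are made up for illustration; real traffic on a misconfigured link can behave much worse than this model.

```python
def best_case_minutes(inbound_min, outbound_min, full_duplex):
    """Idealized transfer time; ignores collisions, retransmits and protocol waits."""
    if full_duplex:
        # Both directions run at the same time, so the longer one dominates.
        return max(inbound_min, outbound_min)
    # Half duplex: only one direction can use the wire at any moment.
    return inbound_min + outbound_min


# Hypothetical split: 30 minutes of traffic divided evenly between directions.
half = best_case_minutes(15, 15, full_duplex=False)
full = best_case_minutes(15, 15, full_duplex=True)
print(f"half duplex: {half} min, full duplex: {full} min")
```

    With a lopsided split (say 28 minutes one way and 2 the other) the same model saves only the 2 minutes, so halving the time really is the best case.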



  • [quote user="RevEng"]Any network guys want to back me up here?  Please tell me there isn't some obscure reason for manually setting the duplex. [/quote]

     We were using some shitty network cabling and hardware for a while.  If you let everything autonegotiate to 100 megabit, full duplex, the network went to shit.  It'd take 2 minutes to load google.com.  Manually set your PC to 10 megabit, half-duplex, though, and suddenly google.com loads in a matter of seconds.

    I'm not quite sure why exactly that was happening, but the empirical evidence was undeniable.

    At any rate, goatcheez is probably right.  And in my experience, CVS is just slow on large filesets, period. 



  • I can understand where they're coming from. At my old job they had almost the exact same problem. It took them months of fiddling to find out that there was one router or something in a wall that didn't do auto-negotiation on duplex the right way. I think they figured out that it was an older, buggy Cisco router/switch/hub. They eventually fixed it. But when a whole 3 users can simply set their duplex statically to 'full' and solve the immediate problem, doing that instead of tearing apart network closets while the new parts get ordered, approved, requisitioned, etc. is actually not a WTF.



  • [quote user="Carnildo"]


    Speed/duplex negotiation doesn't always work right.  These days, the problem is usually that two 100Mb full-duplex cards negotiate a connection of 10Mb half-duplex

    [/quote]

    That is a sure sign of a dead card or a bad cable. It indicates that the autonegotiation failed, which is almost invariably caused by bit errors in the packets used for this purpose.

    Anything that breaks the negotiation will almost certainly screw up regular network traffic as well (dropped packets and random bit errors - typical result, Windows behaves a little more slowly and erratically than normal, but the user doesn't think to report this as a fault). Identify the faulty component and replace it.

    Keeping everything set to autonegotiate and tracking the status of all the ports on the switch is an effective way to catch such problems early. Then your only problem is hardware that is over ten years old (again, watching the status of the ports is a good way to track such hardware down and cause it to fail abruptly, so that it can be replaced).  
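
    As a minimal sketch of that kind of status tracking, here is a check done from the host side rather than on the switch. It assumes a Linux machine, where the kernel exposes the negotiated values in /sys/class/net/<iface>/speed and /sys/class/net/<iface>/duplex. The function names, the "eth0" interface name and the 100 Mb threshold are illustrative assumptions, not anything from this thread.

```python
from pathlib import Path


def read_link_status(iface, sysfs_root="/sys/class/net"):
    """Return (speed_mbps, duplex) for an interface; None where a value is unreadable."""
    base = Path(sysfs_root) / iface

    def read(name):
        try:
            return (base / name).read_text().strip()
        except OSError:
            # Missing interface, or the kernel refused the read (e.g. link down).
            return None

    speed_text = read("speed")
    try:
        speed = int(speed_text) if speed_text is not None else None
    except ValueError:
        speed = None
    return speed, read("duplex")


def looks_suspicious(speed, duplex, expected_speed=100):
    """Flag links that came up at half duplex or below the expected speed."""
    return duplex != "full" or speed is None or speed < expected_speed


# "eth0" is a placeholder interface name.
speed, duplex = read_link_status("eth0")
print(speed, duplex, "suspicious" if looks_suspicious(speed, duplex) else "ok")
```

    Run periodically, something like this would surface a link that negotiated down to 10 half long before a user thinks to report it.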



  • [quote user="GoatCheez"]I fail to see how this will have a huge impact on CVS performance.[/quote]

    I agree, it confused the hell out of me.  Perhaps with the wrong settings there were collisions to hell?  The part I really didn't get was that downloads would work fine otherwise.  It was only when I did something extremely bandwidth intensive for more than about 10 seconds that it would happen.  I checked the network monitor and it would burn along for 10 seconds or so, then stop dead for nearly 10 seconds before sending a single packet, then another 10 seconds, etc.  The transfer would just completely die with about 8b/s getting through.  It was extremely confusing.

    Otherwise though CVS handles rather well given our gigantic dataset.  I've managed to prune it down to 900MB (part of the redesign was removing all the unused files) and I can do a checkout in about 2 minutes and it only takes about 30 seconds for it to run through the entire tree for an update.  Keep in mind that this CVS server is part of our campus-wide LAN; transfers > 10MB/s are quite common. 

    Now if only I could figure out why SSH hangs for almost 10 seconds before the server responds.  Makes multiple operations a bitch (typically longer waiting for SSH to connect than to perform the actual operation). 
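
    For a sense of scale, the checkout figures quoted in this thread work out roughly as below. This is only back-of-the-envelope arithmetic on the numbers already given (2GB in about 8 hours before, 900MB in about 2 minutes after), taking 1GB as 1024MB; the function name is made up.

```python
def throughput_mb_per_s(megabytes, seconds):
    """Average transfer rate in MB/s."""
    return megabytes / seconds


before = throughput_mb_per_s(2 * 1024, 8 * 3600)  # 2GB in about 8 hours
after = throughput_mb_per_s(900, 2 * 60)          # 900MB in about 2 minutes
print(f"before: {before:.3f} MB/s, after: {after:.1f} MB/s")
```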



  • We've discovered many auto-negotiation issues with various vendors... 3Com cards come to mind as being particularly unfriendly with Cisco 3xxx switches, for example. It's gotten much better with gigabit parts (provided your cabling is up to snuff... don't skimp, get the cat-6 stuff because it's worth it).

    Your SSH is taking too long to connect because the server is trying to reverse-lookup your developer workstation, and it's probably a DHCP machine without a corresponding DNS entry, so that's going to fail and time out.

    Have the server admin disable remote host reverse lookups (set UseDNS to 'no' in /etc/sshd_config, or /etc/ssh/sshd_config on many systems).
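
    To see what a given config currently says before asking for the change, a small sketch like the following could help. It assumes you can get a copy of the file's text; the function name and sample config are made up. It relies on sshd treating keywords case-insensitively and using the first value it finds, and as a simplification it stops at the first Match block.

```python
def effective_usedns(config_text):
    """Return the first global UseDNS value ('yes'/'no'), or None if it is not set."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Keyword and argument may be separated by whitespace or a single '='.
        parts = line.replace("=", " ", 1).split()
        if not parts:
            continue
        keyword = parts[0].lower()
        if keyword == "match":
            break  # simplification: ignore anything inside Match blocks
        if keyword == "usedns" and len(parts) > 1:
            return parts[1].lower()
    return None


# Made-up sample config for illustration.
sample = """
# sshd settings
Port 22
UseDNS no
"""
print(effective_usedns(sample))
```

    A result of None means the file doesn't set it explicitly, in which case the built-in default of that sshd version applies.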

     



  • Thanks hk0. Come to think of it, I remember reading about that problem a long time ago.  I'll be sure to pass it on to them.  It would definitely save me a lot of time and boredom.



  • Alas, auto very often doesn't work. I work in an IT department in a university and we often have to hard-code wall plates to full or half duplex depending on the situation. Dell servers running Linux seem to have real problems negotiating with our Cisco switches. I think every port in our server room is explicitly set at this stage, because when the duplex goes wrong on a production server the result is not pretty and generally leads to a lot of annoyed users.

