In a flurry of advice that took less than 24 hours, six HP 3000 veterans chipped in this week to help a 3000 manager who's weathering poor network response times. All of the consulting was free, offered through the 3000's ultimate community resource, the HP3000-L mailing list and newsgroup.
Kevin Smeltzer, an IT Specialist in MPE Systems at IBM's Global Services group, said he was watching his development N-Class responses slip into unusable measurements. "Today was so bad that test programs could not stay connected to a Quick program," he reported at 4 PM yesterday. "Linkcontrol only shows an issue with Recv dropped: addr on one path. This is a known issue with some enterprise network monitoring software that sends a packet that the HP 3000 cannot handle. Even HP last year had no solutions for that issue."
Donna Hoffmeister, Craig Lalley, Mark Ranft, Tony Summers, Mark Landin and Jeff Kell all came to Smeltzer's aid in less than 24 hours. Hoffmeister, Lalley and Ranft work in support and consulting businesses, but nobody wanted to collect any fee. Summers and Landin chimed in as veteran 3000 managers. And Kell, well, he founded the 3000-L and headed the System Manager's special interest group for years. Like the others, he's steeped in the nuances of HP 3000 networking.
So long as the 3000-L is running, no one will run out of places to ask for this kind of help. The thread has run to 16 messages so far: back-and-forth emails with long dumps of NETTOOL reports, examinations of TCP timer settings (Hoffmeister wrote an article on the subject for Allegro's website), and discussion of switch port settings. "Do I need to shutdown and restart JINETD or restart the network," Smeltzer asked this morning, "to have my TCP changes in NMMGR take effect?"
Lalley ventured a guess after a close reading of Smeltzer's reports:
How are your gateways defined? If you change the gateway then you could try deleting the wrong gateway and see if it helps. I think you have a router broadcasting a wrong gateway.
Hoffmeister said the problems might be in the physical layer:
Did you change NMMGR before or after the reboot? If after, you're going to want to reboot again. Your packet loss is disturbing. I'd be suspicious of a physical layer problem.
Problems in the physical layer can be addressed by replacing parts, Mark Landin advised.
Could be a bad network cable or connector. Replace them.
Could be a bad network switch port. Connect the system to another port (properly configured, of course).
Could be a bad NIC. Swap them in the 3000 and see if the problem moves with the card.
Hoffmeister pointed back to the TCP timer issues.
On PCI (A- and N-Class) systems with 100bt cards, you're more likely to see 'recv dropped: addr' counts due to the way the card handles (or not, actually) traffic routed for a different destination.
Typically these counts are nothing to be concerned about. What is concerning are the TCP statistics. Retransmits are almost always a function of using the default (or otherwise messed up) TCP timers. Let's just say I've never seen a case where it's not.
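For readers who want to put a number on the retransmits Hoffmeister describes, the arithmetic is simple: take two snapshots of the TCP counters some time apart and divide the growth in retransmitted segments by the growth in segments sent. The sketch below is only an illustration of that calculation. The counter names and the sample values are hypothetical, not actual NETTOOL field names or figures from Smeltzer's reports, and the thread does not state what percentage should be considered a problem.

```python
def retransmit_percent(segments_sent: int, segments_retransmitted: int) -> float:
    """Return retransmitted segments as a percentage of segments sent."""
    if segments_sent < 0 or segments_retransmitted < 0:
        raise ValueError("counters must be non-negative")
    if segments_sent == 0:
        # No traffic in the interval, so there is nothing to measure.
        return 0.0
    return 100.0 * segments_retransmitted / segments_sent


def interval_retransmit_percent(before: dict, after: dict) -> float:
    """Compare two counter snapshots and report the retransmit rate between them.

    The keys "sent" and "retransmitted" are hypothetical names chosen for
    this sketch; substitute whatever your statistics report calls them.
    """
    sent = after["sent"] - before["sent"]
    retransmitted = after["retransmitted"] - before["retransmitted"]
    return retransmit_percent(sent, retransmitted)


# Made-up sample snapshots, for illustration only.
snapshot_before = {"sent": 10000, "retransmitted": 120}
snapshot_after = {"sent": 12000, "retransmitted": 170}

rate = interval_retransmit_percent(snapshot_before, snapshot_after)
print(f"Retransmit rate over the interval: {rate:.1f}%")
```

Measuring over an interval, rather than using the totals since the last restart, shows whether a timer change made a difference after it took effect.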
You get the idea. Smeltzer, who's competent enough to provide all the needed reports to the 3000 community, is getting HP Support Center-grade assistance. And free. Better assistance, even, since he noted about the enterprise packet problems of 2010, "Even HP last year had no solutions for that issue."
This is why, when our email link to 3000-L went dead for a few days (thanks, AT&T), we got online to set up an alternative delivery address. More than 110 messages this month have been devoted to HP 3000 techniques. You can sign on for the free help at 3000-L, or just read the advice, at the mailing list website (http://raven.utc.edu/cgi-bin/WA.EXE?A0=HP3000-L). The NewsWire would never have gotten off the ground without 3000-L's networking with the community. Make that network one of yours, too.