What’s that burning smell…

Why do the problems always hit the company’s most important systems?

So today, we fixed another problem with our (PITA) “Case Management System”…

but for once, it wasn’t because of Oracle.

(I’m sure the picture to the left has already given you a clue where the fault lay)

So, rolling back a bit to January, we moved the back-end of the case management system to an iSCSI SAN. We used a managed Netgear switch because Cisco’s own 2960G range was “in constraint” and taking 2+ months to be delivered.

This all ran fine until Wednesday night, when we finally received our two new Cisco Catalyst 2960G-24 switches for the iSCSI LAN. In they went, with the reconfiguration done courtesy of the third-party company Novosco (I should really name-drop them here, because they were very helpful throughout).

Thursday morning, our system was off-line for some unexplained reason. We got Novosco involved, and some re-jigging went on to get it back up and running on only one server. Our CMS is clustered for disaster situations 😆

Friday night, they were back in with us looking at the over-arching infrastructure… and the top Cisco switch was factory reset so it could be reconfigured, only for us to discover that the unit had failed entirely. That’s when the lightbulbs came on!

This morning the old Netgear switch went back in, and the iSCSI SAN came up first time. The bottom Cisco switch was tested and came back all OK as well… We had received a faulty top switch 😥
What didn’t help was that the top switch was the “master” switch of the pair, which explains why failover and LACP were screwy and things didn’t work.

So the faulty switch is on its way back to Novosco’s office with the engineer, and the system is up and operational now, running on only one switch, ready for everyone to use it again on Monday morning.

The only question now is how long until something else breaks and takes down the CMS 🙄 It had better not be the other bleeding Cisco switch!


Colin
