We have dropped what was supposed to be the new core at our newest datacenter and rolled back to an old router in the 800 S Hope datacenter. Cisco techs are helping us diagnose the core. We're not yet certain what's happening, but we suspect a faulty distributed forwarding card on our 6704 10GE blade.
The current setup should keep you online. We apologize for the outages, but this is an extreme case of apparent hardware failure.
Last night the fancy new upgraded core in our newest datacenter experienced a very unexpected hardware failure.
Initially, technicians thought we might be dealing with damaged fiber between buildings, as the network was experiencing intermittent lag and packet loss despite no DDoS events. Some IP ranges were fine while others were almost completely inaccessible. As techs cycled through the fiber pairs, it became clear that the fiber was not the problem.
Additional staff were paged to help diagnose the problem. After trying everything we could think of, we began migrating core switching back to the 800 S Hope datacenter to resolve the packet loss some customers were still experiencing. All traffic is now switching out of the old core while we go over the problem with Cisco techs. At this time it appears that a distributed forwarding module for the 10GE blade connecting the new core to the old datacenter is the culprit.
We apologize for any issues this may have caused you or your customers. We do not expect any more network outages while techs replace the faulty device. If we encounter a similar problem in the future, techs know to force the other DCs into direct routing mode rather than letting customers experience lag.