What happened to Head-fi? in Off Topic Posted November 12, 2007

This forum is completely fucked up...and I like it! But I still don't get it...I'm sure I will eventually.

Nick, here's what Jude posted on the Head-Fi Facebook Group page:

Here's a status update (which I will also post on the error screen for those who visit http://www.head-fi.org):

This is obviously our worst outage in the history of Head-Fi.org. What happened was that we had Head-Fi.org's files and backups moved to a multi-terabyte network attached storage (NAS) unit while we continued to work on the proper implementation of a true clustering configuration for Head-Fi.org. From what we can tell, this particular NAS unit--with a reputation for being ultra-reliable--had one of its 12-channel RAID controllers malfunction. This particular NAS unit is a 24-drive unit, made up of two 12-drive arrays, each array with two parity drives (RAID 6). Maybe we put too much faith in it, but we thought we were safe housing everything on it for the time being (the last several months). From what we're being told, when the controller card malfunctioned, it corrupted the NAS unit's logical volume, which is where we're at now.

We are working closely with the vendor and the technical support team in Europe to restore the logical volume and get the NAS back up again. We feel reasonably confident we will be able to restore Head-Fi to its state just before the outage, but we won't know for sure whether we'll have to fall back to a backup, of which there are several on the NAS. Unfortunately, the only off-NAS backups we have of Head-Fi.org's databases are quite old, meaning we'd lose thousands of posts, so I will not put Head-Fi.org back up until we know for sure the status of the logical volume restoration.

All I can do is apologize for this very extended outage, and for not having more recent off-NAS backups.
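For anyone curious about the layout Jude describes, here's a minimal sketch of the arithmetic for a setup like that: 24 drives split into two 12-drive RAID 6 arrays, where each array gives up 2 drives to parity. The per-drive size is my own illustrative assumption (the post never states it), so only the drive counts below come from the update.

```python
def raid6_usable(drives_per_array: int, arrays: int, drive_tb: float) -> float:
    """Usable capacity of a multi-array RAID 6 unit.

    RAID 6 dedicates 2 drives' worth of space per array to parity,
    so each array contributes (drives_per_array - 2) data drives.
    """
    data_drives = (drives_per_array - 2) * arrays
    return data_drives * drive_tb

# The unit described: two 12-drive RAID 6 arrays (24 drives total).
# 1 TB per drive is a guess for illustration only.
usable = raid6_usable(12, 2, 1)   # 20 TB of data, 4 TB of parity

# Each array can survive any 2 simultaneous drive failures -- but that
# parity only protects against *drive* faults; a RAID controller that
# scrambles the logical volume sits above that protection entirely.
```

That last point is the painful part of the story: four parity drives across the unit, and none of them help when the controller itself corrupts the volume.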
Again, we hope to have the site back up and running fully tomorrow evening, but I will keep you up to date if that changes.

The repair was well under way today when the repair process ran out of RAM (the NAS had four gigabytes of RAM, and since, for a number of reasons, the repair process was being run almost entirely from RAM, the four gigs was apparently not enough). I have ordered 16 gigabytes of RAM, which will arrive tomorrow before noon EST, whereupon the team in Europe can commence with the remotely administered repair process(es).

The process has gone slower than we anticipated, and running out of RAM today was an unfortunate setback. But, once again, 16 gigabytes of RAM (in the form of eight 2 GB DDR2 ECC registered modules, versus the four 1 GB modules in there now) should be arriving in the morning, whereupon we'll call our friends in Europe and they'll finish the repair work. We already know some data was lost, but we hope and pray that what we do retrieve will be enough to let us get the site back up tomorrow evening.

We know we should have been more diligent about keeping more backups off the NAS, but given that we were running two 12-drive arrays, each in RAID 6 (two parity drives per array, for a total of four)--and given that our previous NAS units ran without problems for seven years--we felt we were safe keeping them there until we were finally through with the proper clustering we've intended for months.

All I can do is apologize for this very extended outage, and for not having more recent off-NAS backups. Again, we hope to have the site back up and running fully tomorrow evening, but I will keep you up to date if that changes.
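The lesson in that last apology is that backups stored on the same NAS fail along with it. A minimal sketch of that idea, with entirely hypothetical file paths and dates (nothing below comes from the actual Head-Fi setup), is to check which backup copies would survive the primary device dying:

```python
from datetime import datetime

# Hypothetical backup inventory: (location, date taken).
# Paths and dates are illustrative, not Head-Fi's real ones.
backups = [
    ("nas:/backups/db-2007-11-10.sql.gz", datetime(2007, 11, 10)),
    ("nas:/backups/db-2007-11-11.sql.gz", datetime(2007, 11, 11)),
    ("offsite:/archive/db-2007-08-01.sql.gz", datetime(2007, 8, 1)),
]

def off_device_backups(backups, primary_prefix="nas:"):
    """Return only the copies that outlive a failure of the primary device."""
    return [(loc, ts) for loc, ts in backups if not loc.startswith(primary_prefix)]

survivors = off_device_backups(backups)
newest_surviving = max(ts for _, ts in survivors)
# Everything written after newest_surviving is at risk -- exactly the
# "thousands of posts" situation described in the update.
```

Run against an inventory like this, the recent copies all vanish with the NAS and only the stale off-site dump remains, which is why a regular off-device copy matters even with four parity drives in the unit.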