Current (technical) status of the SunSITE: moving, moving, moving...

Moving lots of content to a newer machine, spring 2010

To speed up downloads and avoid overload (see below), SunSITE is moving mirrored content to newer hardware. Most of the bigger mirrors have already moved to vesta.informatik.rwth-aachen.de - you may notice this when you are redirected there.
This is work in progress and will hopefully be completed in summer 2010 without too much disturbance for our clients. The content should be accessible as usual - if you encounter any problems, broken links or the like, please let us know.

Very high load because of various projects' releases, Nov. 18th, 2009

Ubuntu's "Karmic Koala" was still quite new when openSUSE 11.2 was released last week. Yesterday Fedora 12 was opened to the public, and today a new beta release of Knoppix 6.2 came out. Many downloaders try to get all of it, and a "load" number of 200 and more on our two processors is the result... Sorry for the slow response times - we are currently offloading the first mirror content to a newer machine, which will hopefully be available in the next few weeks.

mozilla mirror deleted itself, Nov. 18th, 2009

Last night the whole content of our ftp.mozilla.org mirror was wiped out because of an error on osuosl.org where we get that content from. Currently resyncing...
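
Both this incident and the eclipse one below follow the same pattern: the upstream server suddenly looks empty, and a mirror run that deletes whatever is missing upstream faithfully wipes the local copy. A minimal sketch of a sanity check that could run before such a sync is shown here. It is not what SunSITE actually uses; the function name `safe_to_sync`, its parameters and the 50% threshold are assumptions made for illustration only.

```python
def safe_to_sync(local_count: int, upstream_count: int, max_shrink: float = 0.5) -> bool:
    """Decide whether a deleting mirror run looks safe.

    local_count    -- number of files currently in the local mirror
    upstream_count -- number of files the upstream listing reports
    max_shrink     -- largest fraction of the mirror we accept losing in one run
                      (hypothetical default, tune per mirror)
    """
    if local_count == 0:
        # Nothing to lose locally, so any upstream state is acceptable.
        return True
    if upstream_count == 0:
        # An empty upstream is almost certainly an upstream error.
        return False
    shrink = (local_count - upstream_count) / local_count
    return shrink <= max_shrink


if __name__ == "__main__":
    print(safe_to_sync(120000, 0))       # upstream wiped: refuse
    print(safe_to_sync(120000, 119500))  # normal churn: go ahead
```

If the check fails, the run would be skipped and a human notified instead of syncing. rsync itself offers a related safeguard, the `--max-delete=NUM` option, which stops deleting once NUM files have been removed.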

eclipse mirror deleted itself, Aug. 17th, 2009

Over the weekend the whole eclipse mirror content disappeared. As this was a clean rsync run, the problem seems to be upstream on the main eclipse download server. The contents now seem to be back on eclipse's main server, and an rsync run has been started to get the content back.
Update: Aug. 27.
Up to now only half the volume of the eclipse mirror is back online on SunSITE CEUR. The resync runs very slowly, and the master server seems to be overloaded by the other mirrors' attempts to get their content back.

Hard disc problems, Aug. 14th, 2009

One of our RAID arrays lost a disk which showed write errors. While resyncing, another disk began to have read timeouts. After several attempts to get back into a redundant state, this disk also gave up, so we lost the contents of this array. Both disks have been replaced and we are waiting for the array to finish its initialization. As soon as this is done, we will start to re-mirror the former contents.
Currently unavailable: jpackage, extrarpm, mandriva, ctan, fox, fortran, procmail, delphi, gnoppix, kernel, rfc, gnu
Update: Aug. 27.
Most of the mirrors are back online. Three mirrors were discontinued: extrarpm, delphi and gnoppix didn't seem to be maintained any more, so they were dropped.

Hard disc problems, Jul. 3rd, 2008

Some of our hard disk drives reported problems to the system logs but are still alive. Disk I/O is incredibly slow. We are trying to fix this in cooperation with SUN support; they have promised to call us back after analysing our logs on Mon., 7th July. Mirrors affected are: debian elmix fedora fedora-extras freesoft gentoo gnat gnome iupac kde kernel mysqlmr opencd oscd python sunfware tetex yellowdog collinux debian ecl gnu knoppix legacy mozilla netscape ooodev openof opensuse quantian redhat rfc rtfm slackware ubuntu vigyaan wikipedia
Update: Jul. 9th
After sorting things out it seems only one of two channels is affected by the disturbance - we managed to get one of the affected disks out of an array, which now consists only of unaffected disks and works well at normal speed. The other half of the disks is still running very slowly. The following mirrors respond slowly and may not be up to date because updating takes very long: collinux debian ecl gnu knoppix legacy mozilla netscape ooodev openof opensuse quantian quotas redhat rfc rtfm slackware ubuntu vigyaan wikipedia
Update: Jul. 16th
SunSITE was rebooted and a cable was replaced. Some reconfiguration was necessary, but overall operation and speed seem to be ok.

Storage failure Dec. 18th, 2007

Early in the morning (about 2:00) a fatal error occurred on one of our "older" raid devices. SunSITE did a forced reboot and three of our six storage devices didn't manage to come back to normal business :-(
SUN support told us to wait until all fsck and resync operations have come to an end which will hopefully be the case in the morning of Dec. 19th...
So all the mirrors that have synced again during the last 10 days (see table below) are in an uncertain state now. And this time, even more are affected: FreeBSD, ibiblio, suse, eclipse and some smaller ones are not available at the moment.
Update:
After one day of fsck'ing file systems and resyncing devices it seems that most of our mirrored data is still in place. Our resync table below should show the state of the mirrors correctly and hopefully we are returning to "normal operation mode".

Storage failure Dec. 1st, 2007

In the early morning of Dec. 1st (3:52) a hard disk in our largest raid device gave up - the raid manager immediately initiated a resync with the hotspare drive. 12 hours later a second disk hung with errors. The technically skilled may know what that means for a raid5 volume, which can only survive the loss of a single disk - more than 1.5 terabytes of mirrored content simply vanished from our server :-(
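
For readers less familiar with raid5, the following is a simplified sketch of why one lost disk is survivable and two are not. It models a single stripe with a fixed parity block computed by XOR; a real raid5 rotates the parity across the disks and works on much larger blocks. The block contents and the names `xor_blocks` and `rebuild` are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(stripe, missing):
    """Rebuild the one missing block of a stripe (data blocks plus parity).

    stripe  -- list of blocks; entries at the missing indexes are ignored
    missing -- set of indexes of failed disks
    """
    if len(missing) != 1:
        # With two or more blocks gone, the single XOR equation has
        # more than one unknown and the data cannot be recovered.
        raise ValueError("raid5 can only recover from exactly one failed disk")
    survivors = [block for i, block in enumerate(stripe) if i not in missing]
    return xor_blocks(survivors)


# Three data blocks plus one parity block, as on a four-disk array.
data = [b"debian..", b"fedora..", b"knoppix."]
stripe = data + [xor_blocks(data)]

if __name__ == "__main__":
    print(rebuild(stripe, {1}))  # the lost block comes back from the survivors
    try:
        rebuild(stripe, {0, 1})
    except ValueError as err:
        print("double failure:", err)
```

This is also why the window between the first failure and the end of the hotspare resync is the dangerous one: during that time the array has no redundancy left.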
Most of the technical issues are solved by now and most of the affected mirrors are already up to date again; the rest will follow:

mirror	status
debian (main, non-US, cd-images)	online & up to date
wikipedia	online & up to date (Compact version)
knoppix	online & up to date
fedora	online & up to date
sunf(ree)ware	online & up to date
mozilla	online & up to date
tetex	online & up to date
opencd	online & up to date
mysql	online & up to date
eclipse	online & up to date
yellowdog	online & up to date
opensuse	online & up to date
kernel	online & up to date
ubuntu	online & up to date
python	online, syncing
redhat	online & up to date
(fedora)legacy	online, static

The following diary gives some more details about what happened:

Saturday Dec 1st, 2007 (continuing)

As two disks of the largest raid gave up within ~14 hours of each other, several of the major mirrors are down:

(debian, see below)
eclipse
fedora
kernel
knoppix
mozilla
mysql
opencd
opensuse
python
redhat
sunf(ree)ware
tetex
wikipedia
yellowdog

The second failed disk may have recovered by itself, so a raid resync is in progress ... earliest results to be expected on Wednesday, 5th Dec.
(current resync state: 58.8% done [Tue Dec 4th, 9:30 am]).

Update [Tue Dec 4th, 12:30 pm]

We do not expect the raid to have any data after the resync. Thus, a complete download of the mirrors (~1.5 TB) will have to be done, taking even more time.

Update [Tue Dec 4th, 1:20 pm]

Hard error in another disk, resync restarted, another broken(?) disk treated as "working" by the system ... not good.

Update [Tue Dec 4th, 5:50 pm]

We are just re-creating the debian mirror on one of the other raids; hopefully, it will be up and running again tomorrow.

Update [Wed Dec 5th, 12:50 am]

New hard disks arrived. The raid resync might be finished in the early hours of Thursday, yet the mirrors will still be empty then. Also, we are planning to re-arrange the raid disks to avoid a repetition of the current situation.
The debian mirror has finished downloading about a third of its data; the rest will also take some more time.

Update [Thu Dec 6th, 9:00 am]

Debian mirror is back, up and running (on one of the older raids). Debian and debian-non-US only for the moment; the cd images are still loading.
The other mirrors will take some more days, we fear.

Update [Fri Dec 7th, 11:05 am]

We split the "broken" raid into two, to avoid repetitions of this nice situation. The first half has been up and running since last night; fedora and sunfware are being downloaded. Eclipse should also be available soon.

last change: Reinhard Linde, 31.05.2010