Announcements

3rd July 2009 ~ onwards

Please refer to BII's wiki for IT-related announcements and information.

2nd July 2009

Issues with the mountpoint /mnt/DB on annDB and annDB-dev affected the PostgreSQL databases running on them, causing an outage for the annotator cluster.
1) The mountpoint /mnt/DB on annDB and annDB-dev became read-only at 2:30pm. /mnt/DB mounts an FC LUN from the ACRC storage group. This was due to a storage LUN failover. ACRC storage was alerted at around 3:30pm, after the systems group received a call from an end-user. Multipath was not configured on annDB and annDB-dev.
2) After some modifications by the Storage group, SAN connectivity to annDB/annDB-dev was restored and postgres was manually started at around 4:30pm.
Plan for future - Migrate data in /mnt/DB to the 6140 storage inside the annotator rack. annDB and annDB-dev should only have FC connections to the annotator 6140 storage.
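
The read-only remount above went unnoticed for about an hour until an end-user called. As a rough illustration only (not something the systems group is known to run), a small check along the following lines could flag the condition earlier. It assumes a Linux host where /proc/mounts reflects the kernel's current mount options; the function name and the default watch list are made up for this sketch.

```python
def read_only_mounts(mounts_text, watch=("/mnt/DB",)):
    """Return the watched mountpoints whose mount options include 'ro'.

    mounts_text is expected to be in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mountpoint = fields[1]
        options = fields[3].split(",")
        if mountpoint in watch and "ro" in options:
            found.append(mountpoint)
    return found
```

On a real host the input would be the contents of /proc/mounts, for example read_only_mounts(open("/proc/mounts").read()), run from cron or a monitoring agent that alerts when the returned list is not empty.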

annotator-dev lost its default route
1) NetworkManager was running on annotator-dev. There was an up-down on eth2 and the default route was removed.
Rectification - NetworkManager binaries removed from annotator and annotator-dev. There is no need to run NetworkManager for static IPs.
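
A missing default route like the one above is easy to detect from the kernel routing table. The following is a minimal sketch, assuming a Linux host that exposes IPv4 routes in /proc/net/route (hex-encoded fields, one header line); the function name is hypothetical and this is not a description of any monitoring actually deployed on these machines.

```python
RTF_UP = 0x1  # route flag: route is usable


def default_route_ifaces(route_text):
    """Return the interfaces that carry an active IPv4 default route.

    route_text is expected to be in /proc/net/route format:
    Iface Destination Gateway Flags RefCnt Use Metric Mask MTU Window IRTT
    """
    ifaces = []
    for line in route_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        iface, dest, flags, mask = fields[0], fields[1], fields[3], fields[7]
        # A default route has destination 0.0.0.0 and netmask 0.0.0.0.
        if dest == "00000000" and mask == "00000000" and int(flags, 16) & RTF_UP:
            ifaces.append(iface)
    return ifaces
```

An empty result from default_route_ifaces(open("/proc/net/route").read()) would indicate that the host currently has no default route.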

12th Apr 2009

We were reshuffling corporate services around today. This was done to reduce dependency on servers with single power supplies, as well as to free up 1 unit of Sun T2000 which will be used for the mini DataCenter migration later this year. This in turn led to a loss of corporate services between 2:30pm and 5pm.

24th Mar 2009

At about 8:03pm today, the 4-port 10G card on the BlackDiamond rebooted itself for unknown reasons. When the link came back, there appeared to be packet forwarding inconsistencies on servers attached to the BlackDiamond's 1G copper ports. It was not until 9:26pm that the XOS software on the BlackDiamond (and only the BlackDiamond) was identified as the culprit. The switch was then rebooted and everything returned to normal.

19th Feb 2009

Network latency between BII and various local addresses served by the local ISPs Singnet and Starhub has suddenly risen from under 50ms to over 150ms. This problem has been reported to the network team, and they are looking into it. (This problem was partially fixed on 24th Feb. Special thanks to Derrick)

11th Feb 2009

Yesterday evening at 18:31, the MGE UPS which powers the Foundry BigIron switch in the level 8 LAN room failed. Unfortunately this was a pretty serious failure, burning out some components and tripping the circuit. This resulted in a total loss of the end user Gen4 network on level 8. The network was restored at about 8:30 this morning by powering the switch directly off raw power.

6th Feb 2009

There was a temporary loss of AFS volumes on afs50 between 18:15 and 18:45. The fileserver was performing a lot of unexplained disk I/O, so it was decided to turn on auditing. This, however, required a restart of the AFS fileserver processes. The fileserver process was unresponsive and had to be forcibly restarted. This in turn led to a mandatory salvager run which took 30 minutes to complete.

27th Jan 2009

The 10G port on our BlackDiamond started flapping a few days ago, which led to several interruptions to traffic for servers in rows E to J. Most embarrassingly, the backup network had not been configured until today. We don't know yet whether the flaps are caused by a bad GBIC or fiber patch, but the good news is that traffic is now automatically rerouted within 40 seconds of a link failure.

8th Jan 2009

At 12:55am this morning, our Sun StorEdge 3511 array malfunctioned. All 12 disks seemed to "disappear" from the controller. This led to loss of the filesystem storing all BII user emails. This was especially strange since the array has redundant controllers, power supplies and hot spare disks. Fortunately, after a cold reboot, the disks came back online, and so did the filesystem. The mail system resumed normal operation by 9:30am.

19th Dec 2008

One of the old AFS servers (afs21) suffered a hardware failure today. The volumes residing on the server were not available between 12:05am and 9:50am this morning. Corporate services whose binaries resided on that AFS server include: LDAP, iproxy (squid), ntp, web and webmail. All volumes (and services depending on them) were back to normal by 10:00am.

16th Dec 2008

The routing problem which affected the level 6 end-user networks as well as the link to the Imaging team's NAS has been fixed.

15th Dec 2008

A really strange hardware fault occurred on the DHCP server today. Basically, the payload in packets was being mysteriously corrupted despite the TCP/IP checksums being correct. Interestingly, the problems went away once the box was replaced. The issue of duplicate mails on IMAP is now considered closed as we've not received any more reports of it happening in the last few days.

There is currently a known issue regarding high-volume DNS queries. Basically, when a DNS client sends a very high number of queries to the new DNS servers, the TippingPoint IPS blocks the DNS replies from returning to the client. After 1-2 mins of "silence", the IPS unblocks DNS replies again. Users within the Gen4 network, however, are not affected. (This issue was subsequently fixed by the network team in the first week of February 2009)

5th Dec 2008

There was a disruption of the end-user network on level 6 for several hours this morning. This was due to a hardware failure of a Linux router. The box was replaced and the network was back to normal by 10:30am. The new mail servers (and AFS database servers) have been running for almost a week now. There is a known issue of IMAP users experiencing duplicate emails; we're looking into it at present.

25th Nov 2008

There will be a power shutdown for the three buildings --- Matrix, Genome and Proteos during the weekend of 30th November 2008.

The systems team will take this opportunity to update its AFS servers to make things more scalable and robust. On all AFS clients, some changes to the config files will be needed. We will be going around on Monday, when the servers are up and running, to assess and reconfigure your machine so that it can connect to the new servers. Alternatively, if you want to configure it yourself, you can follow this guide.

However, do note that the new servers will only take effect on DECEMBER 1 2008. PLEASE DO NOT CHANGE YOUR SETTINGS BEFORE THIS DATE!

15th Sep 2008

Today was a strange day. The LDAP master's Berkeley DB files got corrupted. This in turn caused LDAP replication to fail, and users were not able to change passwords. A Sun engineer came down to replace the faulty chassis switch on one of our T2000s. The webmail service was migrated to another server; unfortunately, there were a few minutes of downtime because of this.

9th Sep 2008

One of our Sun T2000 servers decided to power itself off due to a faulty chassis-open detection switch. This led to a loss of webmail service and a partial loss of LDAP and DNS services for about half an hour. We are working with Sun to address the faulty hardware.

13th Aug 2008

While configuring an IP-in-IP tunnel between the Gen4 network and the old tenant LAN Linux router, a kernel fault on the tenant LAN Linux router caused it to panic. The machine had to be rebooted from the console and service resumed after approximately 15 minutes of interruption to user traffic.

30th May 2008

There was a loss of email and DNS services today between 3pm and 5:30pm. The problem was traced to the main NFS server providing storage to these services. After a reboot and verifying that the necessary processes were in place, the problem still wasn't solved. Thus we tossed out the box and replaced it with another machine, and that pretty much fixed things.

26th May 2008

Sun has delivered a whole bunch of new equipment that will be used for upgrading the corporate servers. Their engineers have just finished checking that all hardware was delivered in good working condition.

9th May 2008

The webmail service has been upgraded to the latest version and moved to a Sun T2000 (used to run on a humble Sun v100). Users should expect better responsiveness now, and should also report any bugs or other unexpected behavior.

5th May 2008

There was a 45-minute loss of IMAP service due to an NFS configuration mistake which only showed up today (almost 1 month later)!

9th Apr 2008

After 6 years of service, one of apps10's CPUs has failed. The IMAP service was disrupted temporarily due to the hardware fault. The IMAP service has been switched over to another 280R server (apps9).

8th Oct 2007

We've added new AFS servers for bulk storage since the fileservers handling home directories don't have much disk space. This allows for departmental shared storage areas, of which each department gets 200GB (to start off).

25th May 2007

The Systems Team has implemented the Andrew File System (AFS) in BII.

9th Oct 2006

We now have 14TB of storage, built for under S$30k.

25th July 2005

The IT Systems webpage(s) are now integrated under the BII Intranet. Users should now access all content and/or resources from the Intranet.

17th June 2005

Our (faithful but old) NFS server (Sun v280) and disk array (Sun a1000) have been replaced with a new v240 and StorEdge 3511 array. In addition, we've made some changes to email delivery to improve mail routing.

12th May 2005

Updated software page with latest versions of putty, pscp, winscp, Mozilla, Firefox and Thunderbird.

14th Sep 2004

Updated software page with latest versions of Mozilla, Firefox and Thunderbird. Added Mac versions of the abovementioned software.

Added FAQ on stopping SPAM.

18th Jun 2004

Updated software page with latest versions of Mozilla, Firefox and Thunderbird.

7th Jun 2004

Updated look for the main page and added the mail archive retrieval page. Please contact the systems team if you have any suggestions/problems.

23rd Dec 2003

We now have a proper website for us to put interesting things. We hope that you will find useful information here. Please contact us if you find anything offensive in here, or would like to see some extra material.