Categorie:Storingen
Contents
- 1 Current Service Interruptions and Maintenance
- 2 Recently Resolved Service Interruptions and Maintenance
- 2.1 Host of several virtual servers broken: Roundcube, websites and others
- 2.2 Lilo6 down
- 2.3 Major RU network maintenance Saturday Feb. 27 08:00-20:00
- 2.4 DNS problems from outside with ru.nl
- 2.5 DNS broken for subdomains of ru.nl
- 2.6 Gitlab upgrade
- 2.7 Science VPNsec disruption
- 2.8 DIY temporarily not usable
- 2.9 Science smtp service temporarily not usable
- 2.10 Very long mail aliases temporarily not usable
- 2.11 Switch crash; gitlab+mattermost, licenses and DHZ
- 2.12 Gitlab upgrade
- 2.13 Eduroam problem on campus
- 2.14 RU mail erroneously in Spam folder
- 2.15 Webserver 'havik' offline
- 2.16 Science radius disruption
- 2.17 Webserver 'havik' offline
- 2.18 Webserver 'havik' offline
- 2.19 CN00 Slurm master ubuntu 16.04 down
- 2.20 Sperwer Database server outage
- 2.21 Science VPN disruption
- 2.22 Science datacenter network problem
- 2.23 Jitsi.science.ru.nl not working properly
- 2.24 Mailserver certificate problem
- 2.25 Problems with a virtual host
Current Service Interruptions and Maintenance
New Radius server for ru-wlan and eduroam (wireless)
On Monday, January 28th 2013 at 8:00 pm, one of the servers used by the RU wireless network will be replaced. This replacement affects you as a user of the wireless networks ru-wlan and eduroam: a new certificate will be presented when you connect. You can simply accept it, after which the connection should work. If it does not, remove your old Eduroam or RU-WLAN settings first and then set up the connection again.
Specifically for iPhone / iPad users: we recommend that you first remove your old Eduroam or RU-WLAN profile before activating the new connection without a profile. If that unexpectedly fails, please review the information on www.ru.nl/wireless for iPhone/iPad. If necessary, you can also download a new profile from that site.
Marcel Kuppens 17 jan 2013 10:53 (CET)
Horde webmail server down because of spam
Begin : 20121116 04:05 End : 20121116 09:00 Affected : Users of horde webmail
Last night, horde webmail was misused to send spam. This happened because a naive user gave their Science password to spammers. We stopped horde, disabled the account of that user and restarted horde. Ben Polman 16 nov 2012 08:20 (CET)
EduroamCAT not working with Science accounts
Begin : 2019-02-28 00:00 End : ? Affected : EduroamCAT users with Science accounts
EduroamCAT is the Eduroam Configuration Assistant Tool for many different devices. However, it has not (yet) been set up for use with Science accounts. C&CZ is looking for a solution. In the meantime Eduroam connections have to be configured manually (please consult www.ru.nl/wireless) or using the U/S/E number.
Recently Resolved Service Interruptions and Maintenance
To be informed quickly about service interruptions, you can subscribe to the CPK mailing list.
Host of several virtual servers broken: Roundcube, websites and others
Begin : 2021-03-05 07:45 End : 2021-03-05 09:40 Affected : users of the virtual servers: Roundcube, websites with databases on this server, ...
Yesterday the SSD boot disk of this VM host reported its first problems. This morning that caused all VMs running on this host to stop. The problem was solved by moving the VMs to a different VM host. We will investigate how best to prevent this problem in the future or lessen its impact.
Lilo6 down
Begin : 2021-02-25 17:30 End : 2021-03-04 16:45 Affected : users of lilo
As of Thursday afternoon, lilo6 is down due to hardware issues. Because lilo6 was the default Linux login server (lilo referred to lilo6), this affected many users of lilo. The impact is limited because alternative login servers are available, namely lilo5 and lilo7. As of March 1st lilo refers to lilo7. ssh will warn about DNS SPOOFING, because lilo7 has a different host key:
ECDSA SHA256:si3g2elo5m6TShx3PjX0+vF50pZ8NK/iXz/ESB+ZeP0
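To check a host key by hand, the fingerprint that ssh displays can be recomputed from the public key. The following is a minimal sketch in Python, assuming the key is available as a one-line OpenSSH public key (`type base64-blob comment`, as in a `.pub` file); the function names are illustrative and not part of any C&CZ tooling:

```python
import base64
import hashlib

def sha256_fingerprint(public_key_line: str) -> str:
    """Return the OpenSSH-style SHA256 fingerprint of a public key line."""
    # An OpenSSH public key line looks like: "<type> <base64 blob> [comment]".
    blob_b64 = public_key_line.split()[1]
    digest = hashlib.sha256(base64.b64decode(blob_b64)).digest()
    # OpenSSH prints the digest base64-encoded with the '=' padding removed.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")

def matches_published(public_key_line: str, published: str) -> bool:
    """Compare a computed fingerprint with a published one."""
    return sha256_fingerprint(public_key_line) == published
```

The value returned for the real lilo7 ECDSA key should equal the fingerprint listed above; if it does not, the ssh warning should be taken seriously.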
Major RU network maintenance Saturday Feb. 27 08:00-20:00
Begin : 2021-02-27 08:00 End : 2021-02-27 20:00 Affected : users of the RU network or services
The ISC announced that Saturday February 27 08:00-20:00 major RU network maintenance work will be carried out. This will mean that all RU services will be unavailable several times for at most an hour. This concerns all RU services including those of FNWI/C&CZ: e-mail, VPN, wifi, BASS, OSIRIS, Brightspace, Syllabus+, Corsa, etc.
DNS problems from outside with ru.nl
Begin : 2021-02-21 07:10 End : 2021-02-23 14:30 (?) Affected : everyone trying to access something in ru.nl from off-campus.
The central DNS servers that answer external requests for ru.nl had problems because they received too many requests, which resulted in science.ru.nl and other names not being found. DNS names within ru.nl then do not resolve to an IP address. We increased some TTLs (Time-To-Live values) to try to lessen the problem. The small TTLs were meant to make it possible to move a service to a new server quickly in case of problems, but now they only made the problem bigger. After starting VPN you won't notice this problem, because the internal DNS servers that you then use are not affected. Changes to the RU DNS servers hopefully lessened or removed the problems as of 2021-02-23 14:30.
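Why small TTLs increase the load can be illustrated with a rough model: if every caching resolver asks again once per TTL, the authoritative servers see about resolvers / TTL queries per second for a given name. A sketch in Python; the resolver count and TTL values are made-up numbers for illustration, not measurements from the RU DNS servers:

```python
def authoritative_qps(resolvers: int, ttl_seconds: int) -> float:
    """Approximate steady-state queries per second for one DNS name.

    Assumes every caching resolver asks again as soon as its cached
    answer expires, i.e. once per TTL.
    """
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    return resolvers / ttl_seconds

# Hypothetical numbers: 50,000 resolvers asking for the same name.
short_ttl = authoritative_qps(50_000, 60)    # TTL of one minute
long_ttl = authoritative_qps(50_000, 3600)   # TTL of one hour
print(f"{short_ttl:.0f} vs {long_ttl:.0f} queries per second")
```

In this simplified model, raising a TTL from one minute to one hour cuts the query rate for that name by a factor of 60, at the cost of slower failover to a new server.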
DNS broken for subdomains of ru.nl
Begin : 2021-02-11 ~11:15 End : 2021-02-11 ~13:00 Affected : everyone trying to resolve *.science.ru.nl *.astro.ru.nl etc.
The DNS servers for ru.nl did not serve information about subdomains such as science.ru.nl. As a result, no DNS name at FNWI resolved to an IP address. A workaround is to use 131.174.224.4 and 8.8.8.8 as DNS servers. If you tried to connect to a service for the first time after ca. 11:15, you got an error like "No such domain" or "Cannot resolve". Restarting the RU DNS servers at 12:45 may have fixed the problem. Without a real explanation, the problem went away after a few hours.
Gitlab upgrade
Begin : 2021-02-07 04:00 End : 2021-02-07 12:50 Affected : GitLab and Mattermost users
Services will not be available for a while because of a GitLab and Mattermost upgrade.
Science VPNsec disruption
Begin : 2021-02-03 13:00 End : 2021-02-03 14:02 (for Apple macOS/iOS last fix on February 10) Affected : Users of Science VPN
The expiration date of the certificate of our VPNsec service was not yet being checked regularly, which allowed the certificate to expire. We put a new certificate into place within an hour, and we will check this certificate regularly from now on. For Apple devices we needed to construct a new mobileconfig. This took some time, because in the meantime RU had moved to a different Certificate Authority. For Apple macOS this was ready at the end of Feb. 4, with a new installation procedure. For Apple iOS (iPhone/iPad) the old profile has to be deleted and a new mobileconfig has to be installed.
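A regular expiry check can be very small. The sketch below, in Python, is not the check C&CZ uses; it assumes the certificate's end date is available as text, for example from `openssl x509 -enddate -noout -in cert.pem` (the file name is a placeholder), and the 30-day warning threshold is an arbitrary choice:

```python
import ssl
import time
from typing import Optional

def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter timestamp.

    Accepts e.g. "Jan  5 09:34:43 2018 GMT", optionally prefixed with
    "notAfter=" as printed by `openssl x509 -enddate`.
    """
    prefix = "notAfter="
    if not_after.startswith(prefix):
        not_after = not_after[len(prefix):]
    expires = ssl.cert_time_to_seconds(not_after.strip())
    if now is None:
        now = time.time()
    return (expires - now) / 86400

def needs_renewal(not_after: str, warn_days: int = 30,
                  now: Optional[float] = None) -> bool:
    """True when fewer than warn_days remain (or the date has passed)."""
    return days_until_expiry(not_after, now) < warn_days
```

Run daily from a scheduler, a check like `needs_renewal(...)` would raise a warning well before the certificate actually expires.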
DIY temporarily not usable
Begin : 2021-01-25 07:15 End : 2021-01-25 07:45 Affected : Users wanting to manage their science account
Due to a management operation (planned around this time), the DIY website was unusable. Since the time was very early, it's expected nobody was inconvenienced by this temporary unavailability.
Science smtp service temporarily not usable
Begin : 2021-01-22 10:00 End : 2021-01-22 10:30 Affected : Science mail users wanting to send mail
A configuration change unintentionally made the smtp service unusable. When we noticed this, it was repaired immediately.
Very long mail aliases temporarily not usable
Begin : 2021-01-21 15:52 End : 2021-01-22 09:55 Affected : Science mail aliases of more than 1024 characters
A configuration change had the unintended effect of removing all very long mail aliases. When this was reported the next morning, it was repaired immediately.
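To see which aliases would be affected by such a length limit, the alias list can be scanned before a configuration change. A minimal sketch in Python, assuming a sendmail-style format with one `name: recipient, recipient, ...` entry per line and assuming the 1024-character limit applies to the recipient list; both the format and the function name are assumptions, not a description of the Science mail setup:

```python
def long_aliases(lines, limit=1024):
    """Return names of aliases whose recipient list exceeds `limit` characters."""
    found = []
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments and anything that is not "name: ...".
        if not line or line.startswith("#") or ":" not in line:
            continue
        name, recipients = line.split(":", 1)
        if len(recipients.strip()) > limit:
            found.append(name.strip())
    return found
```

Comparing the result before and after a change would show immediately whether long aliases have disappeared.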
Switch crash; gitlab+mattermost, licenses and DHZ
Begin : 2021-01-07 ~14:30 End : 2021-01-07 ~15:00 Affected : GitLab and Mattermost users, Licenses, DHZ (diy)
Due to a simple management command, the switch (as-ak008-04) crashed and had to be reset manually. The switch sits between the network and the servers for gitlab+mattermost, licenses and the database for DHZ (diy).
Gitlab upgrade
Begin : 2020-11-27 04:00 End : 2020-11-27 ~08:00 Affected : GitLab and Mattermost users (including PEP)
Services will not be available for a while because of a GitLab and Mattermost upgrade.
Eduroam problem on campus
Begin : 2020-07-10 evening End : 2020-07-10 evening Affected : Eduroam users on campus
The ISC announced: For security reasons, the certificate of the wifi server will be replaced in the evening of Friday, July 10. This has consequences for connecting your mobile device to Eduroam when you’re on campus:
• If you get the message that you have to accept the new certificate to use eduroam, choose 'yes'. You can then use eduroam again;
• If you don't get this message and can't connect to Eduroam, choose the wireless network 'eduroam-config'. Accept the terms and conditions. Follow the instructions to reinstall Eduroam.
More information can also be found at www.ru.nl/ict-uk/eduroam (you will need an internet connection for this).
If you have any questions, please contact the ICT Helpdesk (024 - 36 22222).
RU mail erroneously in Spam folder
Begin : 2020-03-25 17:52 End : 2020-07-07 13:13 Affected : FNWI employees with Science mail
On March 25, a rule "2020 Radboud Universiteit" was added to the Science spamfilter. Recently, this rule matched RU-central mailings. As a result, RU-wide mailings from e.g. the RU Board and Radboud Recharge were erroneously delivered to the Spam folder of Science employees. The Science spamfilter tries to fight spam and phishing; this is partly manual work in which errors can't be excluded. C&CZ apologizes for the inconvenience this has caused.
Webserver 'havik' offline
Begin : 2020-06-18 15:45 End : 2020-06-18 16:25 Affected : Users of various websites.
Several parts have been replaced. We assume the problem, that occurred twice, is now resolved. For dual-boot pcs, the boot menu was served by an alternative method during the repair.
Science radius disruption
Begin : 2020-06-17 11:11 End : 2020-06-17 11:56 Affected : Users of Science VPN and Eduroam based on science account
The certificate of the LDAP servers was replaced this morning, which also changed the certificate chain. The radius server uses LDAP as its authentication backend, so the certificate chain in the radius configuration had to be changed too. This was initially overlooked. Radius is the authentication mechanism used by all VPN servers and Eduroam.
Webserver 'havik' offline
Begin : 2020-06-17 03:38 End : 2020-06-17 08:52 Affected : Users of dual-boot PCs (the dual-boot menu is served by a website) and various websites.
The server went down in the same way as the previous time (3rd of June 2020). The cause is most likely a system board problem. This part will be replaced tomorrow by a support engineer.
Webserver 'havik' offline
Begin : 2020-06-03 06:30 End : 2020-06-03 10:12 Affected : Users of dual-boot PCs (the dual-boot menu is served by a website) and various websites.
The server couldn't be reached after the scheduled weekly reboot, not even on its management interface. Because C&CZ employees also work from home and the interruption was not treated with enough urgency quickly enough, it lasted too long; our apologies for that. The support partner has been contacted and the server has been updated, but the origin of the problem is still unclear. We will also look at making these services more redundant or easier to move to a different server.
CN00 Slurm master ubuntu 16.04 down
Begin : 2020-05-18 09:50 End : 2020-05-19 12:15 Affected : slurm on ubuntu 16.04 (cn07)
Due to a failed BIOS upgrade, the hardware of the database server appears to be bricked. We transferred the disks to another machine (cn00) and all database services are now up again, at the cost of not having cn00 running. When the hardware is working well again, we will swap it all back and restore the original situation.
Sperwer Database server outage
Begin : 2020-05-18 06:30 End : 2020-05-18 10:00 Affected : various websites and slurm
Due to a failed BIOS upgrade, the hardware of the database server appears to be bricked. We transferred the disks to another machine (cn00) and all database services are now up again, at the cost of not having cn00 running. When the hardware is working well again, we will swap it all back and restore the original situation.
Update May 19th, 12:15 : hardware fixed, situation back to the original state.
Science VPN disruption
Begin : 2020-05-06 05:00 End : 2020-05-06 08:00 Affected : Users of Science VPN
Unexplained crashes starting around 5am on the host system. System offline at around 6am. After a hard reset around 08:00, all seems to be all right again.
Science datacenter network problem
Begin : 2020-04-30 12:08 End : 2020-04-30 21:44 Affected : Users of Ceph storage and a few new compute clusternodes
A broken transceiver caused flapping of a 100 Gb/s connection between two C&CZ datacenters. Hours later the flapping increased, which took down the complete redundant new connection between the two server rooms. When this was noticed, a workaround was found quickly: shutting down the interface with the broken transceiver restored the connection. The broken transceiver has been replaced thanks to swift action from our supplier, and we now have spare parts ready to use. We asked the supplier whether a configuration change can make the connection more redundant, so that a single broken transceiver will not take down the connection.
Jitsi.science.ru.nl not working properly
Begin : 2020-04-19 15:00 End : 2020-04-20 11:40 Affected : Users of jitsi.science.ru.nl
Due to performance tuning that went wrong, the jitsi.science.ru.nl conference rooms could not be joined by more than one person. Solved by reinstalling the server.
Mailserver certificate problem
Begin : 2020-04-13 14:00 End : 2020-04-13 14:35 Affected : Users of Science mail
The new certificate of the Science mailserver had not yet been installed in the right location. The expiration of the old certificate caused a problem for Science mail users, which was resolved by replacing the old certificate.
Problems with a virtual host
Begin : 2020-02-18 05:30 End : 2020-02-18 09:08 Affected : Users of mx3, smtp3, crestron, gitlab (PEP), goudsmit, msql01 and labservanttestvm.
The virtual machine host 'oscar' could not boot. Again, a broken LVM snapshot caused the problem.
Archived service interruptions can be found in the service interruptions archive.
Be informed quickly via the CPK mailing list or the RSS feed.
Pages in category "Storingen"
The following 5 pages are in this category, out of 5 total.