Disk space

Diskspace

Disk space on C&CZ servers can be used from all kinds of C&CZ-managed servers and personal computers, and also from other PCs or even from home, using WinSCP or VPN. Almost all disks managed by C&CZ are backed up regularly, so that data can be restored after small or large calamities.

Home directories

Every user with a Science login has, or is entitled to, a few gigabytes of disk space on a server. This disk space is called the "home directory" on Unix/Linux computers and the "H- or U-drive" on Windows computers. The server that holds a given home directory can be looked up on the Do-It-Yourself website.

Naming

Username guest204
SMB (Windows/...) name \\home1.science.ru.nl\guest204 or \\home2.science.ru.nl\guest204 (check on DIY)
URL (Apple/Linux/Android/...) name smb://home1.science.ru.nl/guest204 or smb://home2.science.ru.nl/guest204 (check on DIY)
NFS (C&CZ-managed Linux) name /home/guest204

Access rights

Long ago, the (Unix) home directory of a user was readable by all users of the server, except for a few protected areas. Today a home directory is accessible only to its owner, unless the owner changes the access rights. C&CZ checks for home directories that are writable by other users.

Access through NFS

Mounting a home (U:) drive on Linux via NFS/Kerberos.

Functionality and costs of network shares

RAID server shares

Disk space for groups/institutions/projects: there are a few file servers with RAID storage whose partitions can be rented for a period of 3 years. As of July 2018, the price for FNWI departments of a new disk, or of a new 3-year extension of an existing disk, is:

size                                      incl. backup         without backup
ca. 200 GB                                € 40 per year        € 10 per year
ca. 400 GB                                € 80 per year        € 20 per year
> 400 GB up to 1 TB (no daily backup?)    € ??? per TB/year    € 50 per TB/year
> 1 TB (no backup?)                       N/A                  Have a look at Ceph storage

Although even the cheapest option is much more expensive than buying one disk for one PC, it often makes sense because of the reliability (redundant disks, backup, support contract) and security (stable server). One or more folders on such a partition can be mapped as a network drive on Windows PCs or NFS-mounted on Unix/Linux hosts. The ability to read and/or write files in these folders can be limited to a group of logins. That group can be managed by the department on the Do-It-Yourself website.
C&CZ has service contracts for these servers and keeps spares on site, so a failure can be resolved quickly. Because the disks are part of a RAID set, the failure of a single disk, or even of two disks, does not disrupt service for users. The partitions are backed up (daily and incremental). Even if the whole server room is lost, data can (eventually) be restored.

Ceph Storage

Since November 2019 we can provide almost unlimited storage for the Faculty of Science using our Ceph storage cluster. With Ceph there is a tradeoff between performance and redundancy. The additional redundancy options also make it possible to go beyond the redundancy of single-server RAID-6. The physical storage servers are spread across three locations (datacenters). NB: Ceph volumes have no backups, because the volumes tend to be too large to back up.

Choices in redundancy

Ceph has different options for storing data (configurable per "pool"). By default, Ceph stores three copies of the data, so when one copy is lost, the remaining two still provide redundancy. Because we have three locations, the 3copy pool remains available when a whole datacenter becomes unavailable. When we started, we created a 4copy pool so that it would remain available when a location was off-line, but with three locations this adds little to the redundancy.

Besides storing copies of the data blocks, Ceph can use "Erasure Coding" (EC) as an alternative way of providing redundancy. The advantage is that much less storage overhead is required for safe storage; the disadvantage is a high overhead when storing small files. We have several EC pools. EC8+3 is the cheapest, but if one datacenter is destroyed, all the data is lost (very unlikely!); if one datacenter becomes temporarily unavailable, the data is still safe, but off-line. Our EC5+4 pool remains available when a whole datacenter is off-line or lost; the data remains safe as long as two datacenters are working well.

Ceph erasure coding has a high overhead for smaller files. The prices mentioned below are based on the optimal storage overhead, which is approached when the stored files are 4 megabytes or larger.
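The difference between the pools can be made concrete with a little arithmetic. The sketch below is illustrative only: it uses the standard definition of a k+m erasure code (k data chunks plus m coding chunks, any m of which may be lost), and it assumes that the chunks are spread as evenly as possible over the three datacenters. That placement is an assumption made for the example, not a description of the actual cluster configuration, and the function names are hypothetical.

```python
from math import ceil


def ec_overhead(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data for a k+m erasure-coded pool."""
    return (k + m) / k


def replica_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data for an N-copy pool."""
    return float(copies)


def readable_after_site_loss(k: int, m: int, sites: int = 3) -> bool:
    """True if the data is still readable after losing the fullest site.

    ASSUMPTION: the k+m chunks are spread as evenly as possible over the
    sites, so the fullest site holds ceil((k+m)/sites) chunks. A k+m code
    tolerates the loss of at most m chunks.
    """
    worst_case_lost = ceil((k + m) / sites)
    return worst_case_lost <= m


if __name__ == "__main__":
    for name, k, m in [("EC8+3", 8, 3), ("EC5+4", 5, 4)]:
        print(
            f"{name}: {ec_overhead(k, m):.3f}x raw storage, "
            f"readable with one site down: {readable_after_site_loss(k, m)}"
        )
    for copies in (3, 4):
        print(f"{copies}copy: {replica_overhead(copies):.1f}x raw storage")
```

Under these assumptions EC8+3 needs 1.375 times the raw storage but cannot be read with one site down, while EC5+4 needs 1.8 times and can, which is consistent with the behaviour described above. The copy pools need 3 or 4 times the raw storage.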

NB: 1 TB is 1,000,000,000,000 bytes (10^12 bytes)

Pool                      why                                    price per TB per year, without backup
Erasure coding 8+3 (*)    cheap                                  € 50 (was 45)
Erasure coding 5+4        cheap + additional redundancy          € 60
3 copy                    faster r+w                             € 100
4 copy (**)               faster r+w + additional redundancy     € 135

* The EC8+3 price was raised from 45 to 50 euro as of January 1st 2022, because the storage is used for large amounts of very small files, which creates a significant overhead in Ceph.

** When we had only two locations, the 4copy pool had the advantage that it remained usable when one datacenter location failed. Now that we have three locations, the 3copy pool also remains active when one datacenter breaks. The 4copy pool still has the benefit of an extra copy, but this gives no advantage when a whole datacenter breaks.
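To compare pools for a given volume size, the yearly price can be estimated from the table. The following is a minimal sketch: the prices are copied from the table above, while the pool keys, the function names and the assumption that fractional terabytes are charged proportionally are made up for this example. Actual invoicing may round differently.

```python
# Prices in euro per TB per year, copied from the table above.
# The dictionary keys are hypothetical labels chosen for this example.
PRICE_PER_TB_YEAR = {
    "ec8+3": 50,
    "ec5+4": 60,
    "3copy": 100,
    "4copy": 135,
}

BYTES_PER_TB = 10**12  # 1 TB is 1,000,000,000,000 bytes


def bytes_to_tb(num_bytes: int) -> float:
    """Convert a size in bytes to decimal terabytes."""
    return num_bytes / BYTES_PER_TB


def yearly_cost(pool: str, size_tb: float) -> float:
    """Estimated yearly price in euro for a volume of size_tb terabytes.

    ASSUMPTION: the price scales linearly with size, without rounding.
    """
    return PRICE_PER_TB_YEAR[pool] * size_tb


if __name__ == "__main__":
    size_tb = 20
    for pool in PRICE_PER_TB_YEAR:
        print(f"{pool}: {yearly_cost(pool, size_tb):.0f} euro per year for {size_tb} TB")
```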


The Ceph storage can be used as a Windows/Samba share, an NFS share or an S3 object store. An object store differs fundamentally from a normal filesystem, so data stored in a Windows or NFS share cannot be accessed using the S3 protocol.

The performance properties of Ceph differ from those of traditional single-server storage: write speed usually exceeds read speed, and large numbers of small files severely reduce throughput, even more than on traditional storage.

Naming

Volume name sharename
SMB (Windows/..) name \\sharename-srv.science.ru.nl\sharename
URL (Apple/Linux/Android/...) name smb://sharename-srv.science.ru.nl/sharename
NFS (C&CZ-managed Linux) name /vol/sharename
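The naming pattern in the table can be expressed as a small helper, which may be handy in scripts that have to work on several platforms. This is only a sketch of the pattern shown above: the function name and the example share name "myproject" are hypothetical, and a specific share may deviate from the pattern, so check the names given for your own share.

```python
def share_names(sharename: str) -> dict[str, str]:
    """Build the SMB, URL and NFS names of a share from its volume name.

    Follows the pattern in the naming table: the server is called
    <sharename>-srv.science.ru.nl and the share has the volume name.
    """
    host = f"{sharename}-srv.science.ru.nl"
    return {
        # Windows UNC path: two leading backslashes, one before the share.
        "smb_windows": "\\\\" + host + "\\" + sharename,
        "smb_url": f"smb://{host}/{sharename}",
        # Path on C&CZ-managed Linux machines.
        "nfs": f"/vol/{sharename}",
    }


if __name__ == "__main__":
    # "myproject" is a made-up share name used for illustration only.
    for kind, name in share_names("myproject").items():
        print(f"{kind}: {name}")
```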

Access rights

Most of the shared disks can be read and written by a specific group of users. The owners of this group can administer on the Do-It-Yourself website which accounts are members of this group.

Requests

A request for one or more network disks should contain:

  • requested name of the disk(s)
  • requested size (max ca. 500 GB with backup)
  • possibly requested backup schedules to lower the price (Daily/Monthly/Yearly)
  • Science login name of an owner
  • possibly Science login name of a member
  • charge account (kostenplaats) or project code for the costs in the first three years.

Temporary shared diskspace

Every now and then you want to send one or more large files (more than a few tens of MBs) to someone else within the Faculty, and mail is unsuited for files of that size. To make this easy, there is a network share where you can store large files temporarily, so that someone else can copy them from that location. Note that this share is explicitly meant for temporary storage: we do not make backups of it, and every day we remove files that are more than 21 days old. When copying files to this share, make sure the file timestamps are updated. Some copy programs (such as rsync with the -a or -t option) preserve the original timestamps, so files that were already old when they were copied will be deleted. To update the timestamps, you can run the following command in the directory that contains your files:

find . -exec touch {} +
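To check beforehand which files would fall under the 21-day rule, something like the following sketch can be used. It assumes that the cleanup looks at the modification time of each file, which is what touch updates; the exact criterion used by the cleanup job is not documented here, and the function names are hypothetical.

```python
import os
import time
from pathlib import Path

MAX_AGE_DAYS = 21  # retention period of the temporary share


def files_at_risk(root, max_age_days=MAX_AGE_DAYS, now=None):
    """Return the files under root whose modification time is too old.

    ASSUMPTION: the daily cleanup decides on the modification time (mtime).
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )


def refresh_timestamps(paths):
    """Set access and modification time of each path to the current time."""
    for path in paths:
        os.utime(path, None)


if __name__ == "__main__":
    old_files = files_at_risk(".")
    for path in old_files:
        print(f"would be removed soon: {path}")
    # Uncomment to give these files a fresh timestamp:
    # refresh_timestamps(old_files)
```

Run it from your own subdirectory on the share. Unlike the find command above, it only touches files that are actually older than the limit.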

This share can also be used to store temporary files only readable for yourself by using a different name for the share. Note that also in this case, old files will be removed.

Please create a subdirectory with your name first, and put your files in that directory.

For files totaling less than 250 GB, SURFdrive is also an alternative. For sending files of up to 500 GB, SURFfilesender can be used.

Naming

Volume name temp
SMB (Windows/..) name \\temp-srv.science.ru.nl\share or \\temp-srv.science.ru.nl\onlyme
URL (Apple/Linux/Android/...) name smb://temp-srv.science.ru.nl/share or smb://temp-srv.science.ru.nl/onlyme
NFS (C&CZ-managed Linux) name /vol/temp

Access rights

  • Readable by all users: share
  • Only readable for the owner: onlyme
