Sucking it up

While getting an RDSI allocation is the first step towards making nationally significant data available for sharing, loading data on to RDSI storage can take time.

Data may already be on networked storage, in which case transferring it across the network is one way to get data ingest going. For large datasets, though, this can cause logjams as the data has to compete with other network traffic.

In any case, lots of data sits on portable hard drives - in labs, offices, workrooms and homes. If a researcher has data on multiple hard drives, it can be difficult, and time-consuming, for them to upload it.

QCIF has two tools to help speed up ingest – ‘Hoover’ and Dustbuster. ‘Hoover’ is a computer with multiple USB ports. Up to eight portable hard drives can be connected simultaneously, and ‘Hoover’ will copy the data. A researcher recently brought in more than 37 2 TB hard drives of genomics data for us to ‘Hoover’.

The benefit is that the researcher does not have to feed in the data themselves – once ‘Hoover’ has ingested its latest ‘meal’, we can push the data to its RDSI storage via a fast 10 Gb/s link.
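
To give a rough sense of scale, the figures above can be turned into a back-of-the-envelope estimate. The sketch below is illustrative only and rests on several assumptions: that the fast link carries 10 gigabits per second, that it is fully available to the transfer, that exactly 37 drives are involved (a lower bound), and that every drive is full. The function names and the `efficiency` parameter are made up for this example and do not describe how ‘Hoover’ actually works.

```python
import math


def batches_needed(num_drives, ports=8):
    """Number of loads required if at most `ports` drives fit at once."""
    return math.ceil(num_drives / ports)


def transfer_hours(total_terabytes, link_gbps=10.0, efficiency=1.0):
    """Hours to push data over a link, using decimal units (1 TB = 1e12 bytes).

    `efficiency` is a hypothetical fudge factor (0 to 1) for protocol
    overhead and competing traffic; 1.0 means the ideal line rate.
    """
    bits = total_terabytes * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


drives = 37    # assumed: treated as exactly 37 drives
drive_tb = 2   # assumed: each drive completely full

print(batches_needed(drives))                       # loads of up to eight drives
print(round(transfer_hours(drives * drive_tb), 1))  # ideal hours on the link
```

Under these assumptions the drives need five loads, and the 74 TB would take a little over 16 hours on the link at the ideal rate. Real transfers would be slower, and copying from the USB drives themselves is likely to take longer than the network step.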

Dustbuster is a 20 TB portable hard drive. It is useful for getting data from computers that cannot be physically brought in to QCIF’s offices. Since QRIScloud takes data from researchers all around Queensland, Dustbuster allows us to go and fetch data from many different locations.

This is particularly useful for universities on slow connections that do not want to saturate their networks with data transfers.