Astronomy faces a data avalanche. Breakthroughs in telescope, detector, and computer technology allow astronomical instruments to produce terabytes of images and catalogues. These datasets will cover the sky in different wavebands, from gamma rays and X-rays through optical and infrared to radio. In a few years it will be easier to “dial up” a part of the sky than to wait many months for access to a telescope. With the advent of inexpensive storage technologies and the availability of high-speed networks, the concept of multi-terabyte on-line databases interoperating seamlessly is no longer outlandish. More and more catalogues will be interlinked, query engines will become increasingly sophisticated, and the research results from on-line data will be just as rich as those from “real” telescopes. Moore’s law is driving astronomy even further: the planned Large Synoptic Survey Telescope, capable of scanning the entire sky every three nights, will produce over 7 petabytes per year by the time it is operational in 2012. These technological developments will fundamentally change the way astronomy is done.
International Virtual Observatory Alliance
To address the challenges of storing and managing the enormous amounts of data generated by these instruments, the International Virtual Observatory Alliance (IVOA) was formed in 2002. Its mission is to facilitate the international coordination and collaboration necessary for the development and deployment of the tools, systems and organizational structures that enable the international utilization of astronomical archives as an integrated and interoperating virtual observatory. As of January 2005, there were 15 funded VO projects, from Australia, Canada, China, Europe, France, Germany, Hungary, India, Italy, Japan, Korea, Russia, Spain, the United Kingdom, and the United States. The VO is a system in which the vast astronomical archives and databases around the world, together with analysis tools and computational services, are linked into an integrated facility. It aims to enable new science by enhancing access to data and, more importantly, by providing the computing resources required to analyse the data.
VOTable
One of the major accomplishments of the IVOA was the development of a standard format for data transfer, known as VOTable. The VOTable format is an XML standard for the interchange of data represented as a set of tables. In this context, a table is an unordered set of rows, each of a uniform format, as specified in the table metadata. VOTable has built-in features for big-data and Grid computing. It allows metadata and data to be stored separately, with the remote data linked. Processes can then use the metadata to “get ready” for their input data, or to organize third-party or parallel transfers of the data. Remote data allow the metadata to be sent in email and referenced in documents without pulling the whole dataset along with them: just as we are used to the idea of sending a pointer to a document (a URL) in place of the document, so we can now send metadata-rich pointers to data tables in place of the tables themselves. The adoption of this format ensures interoperability between all virtual observatories worldwide: while individual observatories can store telescope data in any format, virtual observatory compliance requires that they be able to describe and provide the data in VOTable format.
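To make the split between table metadata and table data concrete, here is a minimal sketch in Python that reads a small VOTable-style document using only the standard library. The sample document is hand-written for illustration: the table name, column names and values are invented, the XML namespace is omitted for brevity, and the helper name read_votable is hypothetical rather than part of any VO toolkit. Real VOTable files carry more structure than this.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified VOTable-style document (illustrative values only,
# XML namespace omitted). FIELD elements hold the metadata for each column;
# TABLEDATA holds the rows themselves.
VOTABLE_XML = """<VOTABLE version="1.1">
  <RESOURCE>
    <TABLE name="example_sources">
      <FIELD name="RA" datatype="double" unit="deg"/>
      <FIELD name="Dec" datatype="double" unit="deg"/>
      <FIELD name="Flux" datatype="float" unit="mJy"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>10.50</TD><TD>-27.10</TD><TD>3.2</TD></TR>
          <TR><TD>10.75</TD><TD>-27.30</TD><TD>1.8</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""


def read_votable(xml_text):
    """Return (fields, rows) for the first TABLE in a simplified VOTable.

    fields is a list of attribute dicts, one per FIELD (the metadata).
    rows is a list of dicts mapping column name to a float value.
    This sketch assumes every column is numeric.
    """
    root = ET.fromstring(xml_text)
    table = root.find("./RESOURCE/TABLE")

    # Metadata first: a consumer can inspect this before touching any data.
    fields = [dict(field.attrib) for field in table.findall("FIELD")]
    names = [field["name"] for field in fields]

    # Then the data: each TR is one row, each TD one cell in FIELD order.
    rows = []
    for tr in table.findall("./DATA/TABLEDATA/TR"):
        values = [float(td.text) for td in tr.findall("TD")]
        rows.append(dict(zip(names, values)))
    return fields, rows


if __name__ == "__main__":
    fields, rows = read_votable(VOTABLE_XML)
    for field in fields:
        print(field["name"], field["datatype"], field["unit"])
    for row in rows:
        print(row)
```

Because the column descriptions are read before any row, a program could check units and data types, or decide how to split up a transfer, without yet having the rows in hand. That is the property the remote-data feature relies on.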
As well as developing standards, VO groups around the world are working on issues such as query languages, data curation and software applications.
Aus-VO
The Australian Virtual Observatory (Aus-VO), founded in 2003, is a collaboration between six Australian universities, along with the Australia Telescope National Facility, Mt Stromlo and Anglo-Australian Observatories and the Victorian Partnership for Advanced Computing (VPAC). The project aims to build a virtual observatory environment that links the archives of all Australian telescopes and interfaces seamlessly with other international virtual observatories.
Astronomers will explore Aus-VO and the IVO using advanced data mining and visualisation tools. These tools will exploit a unified data interface to enable cross-correlation and combined processing of data from otherwise disparate sources.
Researchers from the University of Queensland's Astrophysics group are engaged in developing the Australian Virtual Observatory by implementing a number of new search techniques and features. Consider the problem of matching an optical (dense) catalogue to a radio (relatively sparse) catalogue. There can be multiple candidate optical counterparts for every radio source, so how do we pick out the correct optical match and ensure that it is the optimal counterpart to the radio detection? David Rohde is completing his PhD on the statistical problems of matching such catalogues. Alejandro Dubrovsky is developing a catalogue matching web tool.
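The difficulty can be seen with a naive positional cross-match, sketched below in Python. This is only a simple nearest-neighbour baseline and not the statistical or machine learning approach being developed by the group; the function names, the tuple layout (identifier, right ascension, declination, all in degrees) and the search radius are assumptions made for illustration.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula).

    All inputs are in degrees.
    """
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def nearest_neighbour_match(radio, optical, radius_deg):
    """Match each radio source to the closest optical source within a radius.

    radio and optical are lists of (identifier, ra_deg, dec_deg) tuples.
    Returns a list of (radio_id, best_optical_id_or_None, n_candidates).
    This brute-force loop compares every pair, so it only suits small lists.
    """
    matches = []
    for r_id, r_ra, r_dec in radio:
        candidates = []
        for o_id, o_ra, o_dec in optical:
            sep = angular_separation_deg(r_ra, r_dec, o_ra, o_dec)
            if sep <= radius_deg:
                candidates.append((sep, o_id))
        candidates.sort()  # smallest separation first
        best = candidates[0][1] if candidates else None
        matches.append((r_id, best, len(candidates)))
    return matches


if __name__ == "__main__":
    # Invented positions: two optical sources fall near the first radio source.
    radio = [("R1", 10.0, -30.0), ("R2", 50.0, 10.0)]
    optical = [("O1", 10.001, -30.0), ("O2", 10.0, -30.0005), ("O3", 20.0, 0.0)]
    for match in nearest_neighbour_match(radio, optical, radius_deg=0.002):
        print(match)
```

The candidate count returned for each radio source shows where the ambiguity arises: whenever it exceeds one, taking the closest optical source is only a guess, since positional errors and the density of the optical catalogue can make a chance alignment the nearest neighbour.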
QCIF is supporting the Aus-VO project by providing data storage facilities. The tape storage facilities at UQ will undergo a 100 TB upgrade, along with another 10 TB of fast-access disk, bringing QCIF's total data storage capacity (located at both UQ and JCU) to 270 TB. Data storage facilities at Monash University in Melbourne and at the APAC facilities at ANU are undergoing similar upgrades.
Contacts
David Rohde, Alejandro Dubrovsky, Dr Michael Drinkwater
Astrophysics Group, UQ
Australian Virtual Observatory
Publications
D.J. Rohde, M.J. Drinkwater, et al. "Applying machine learning to catalogue matching in astrophysics". Mon. Not. R. Astron. Soc. 360, 69-75 (2005).
