A Machine Learning Community of Practice (ML CoP) for Australia was launched last month to help anyone interested in machine learning for research collaborate on the field's emerging needs in capabilities and expertise.
The ML CoP, launched on 15 April, is open to user communities who apply or intend to apply ML to their research, irrespective of their expertise levels; practitioner communities who have hands-on expertise in ML and can help others apply it; training groups and volunteers interested in developing and delivering ML training; and infrastructure providers who host specialist infrastructure, such as high-performance computing.
“There is a general consensus that we need avenues for eResearch interest groups—data science, machine learning and artificial intelligence—to come together and discuss specific interests, challenges and opportunities together,” said Komathy Padmanabhan, a co-organiser of the ML CoP and the Principal Lead of the Data Science and AI Platform at Monash University.
Led by the Australian Research Data Commons (ARDC), the ML CoP will host meetings covering topics such as data curation and pre-processing, tools and libraries, skills, training, computing environments, and public testing and benchmarking data sets.
The ML CoP will also host events (training, showcases and networking), symposiums and discussion forums. The first showcase will occur online on Thursday, 24 June, 2pm–4pm.
To join the ML CoP, please subscribe to the ML4AU mailing list.
Environments to Accelerate Machine Learning-Based Discovery project
Australia’s new machine learning (ML) community of practice is one of the outcomes of the ARDC-funded, two-year Environments to Accelerate Machine Learning-Based Discovery project.
Another outcome is the ML4AU portal, launched in October 2020. ML4AU is a repository for all project resources, including training material, scripts, a tool library, a data library and a calendar of ML training events. The portal also makes the resources FAIR (findable, accessible, interoperable and reusable).
“ML4AU is freely accessible to anyone—contributed to by everyone and leveraged by everyone,” said Komathy, one of the project’s principal investigators.
To accelerate the adoption of ML techniques among Australian researchers, Monash and the University of Queensland (UQ) are working together on the project to build environments that support ML at scale, provide access to targeted training, and support a national community of practice.
Project outcomes will include sharing ML and HPC expertise between the universities to create an integrated development environment for ML with access to advanced tools and standard public data collections; creation of tools to allow rapid deployment of ML workflows on a variety of systems; and development of new training materials and communities of practice to support a range of researchers, from those just getting started to advanced power users.
Using part of the Queensland Government’s Research Infrastructure Co-investment Fund (RICF), QCIF is supporting two UQ-based staff contributing specialist skills to co-design and deliver computational tools and training expertise for the project. As a result, staff and HDR students at QCIF Member universities may access the project’s machine learning workshops.
The project grew out of an ARDC discovery project in early 2019 that surveyed researchers in Australia and New Zealand about the challenges of using ML techniques.
Komathy, from Monash University, said: “We knew from the pattern of activities running on the machines at our MASSIVE High-performance Computing Centre that there has been a surge in ML usage. We wanted to understand what challenges researchers are facing, either researching ML or using it.”
Dr Nick Hamilton, a Senior Machine Learning Consultant at UQ’s Research Computing Centre (RCC), added: “We were struck by the survey respondents’ diversity of fields, including Engineering and IT, Medicine and Health Sciences, Law, Business and Economics, Science, Art and Design, Environmental Science, Linguistics, AI and Agriculture.
“With machine learning becoming critical in so many fields, it is essential to develop new training courses and state-of-the-art but easy-to-use computational resources to enable researchers to rapidly exploit the opportunities proffered by recent advances in machine learning.”
The survey, jointly run by Monash and UQ, revealed four main challenges:
- Data availability and access – Despite this being the era of big data, the availability and accessibility of data remain challenging; for example, health data generated by hospitals is not available due to privacy concerns.
- Computing environments – Because high-performance computers are expensive, they are shared and there is always a waiting period.
- Tools – Researchers generally lack awareness of what tools and libraries are available and suitable, and of how to set them up on compute environments.
- Skills – Researchers are not sure how to apply advanced ML techniques in their research and are not aware of the key software packages.
As part of the project, Monash and UQ, in partnership with QCIF, have already delivered 140 hours of ML training to more than 900 participants from 22 organisations.
Five new courses on ML, deep learning and natural language processing have been developed so far and published under a Creative Commons licence. The team has also been providing ‘train the trainer’ sessions for other organisations to adapt the material for their own users.
The Environments to Accelerate Machine Learning-Based Discovery project began on 10 April 2020 and its expected completion date is 3 February 2022.
This article is largely based on one published by ARDC on 23 March 2021.