ICTDAT503
Use unsupervised learning for clustering


Application

This unit describes the skills and knowledge required to cluster data extracts from big data following unsupervised machine learning methodologies and report on the findings.

It applies to individuals who work in roles including data analysts, data scientists, machine learning engineers, developers and programmers, and who are responsible for data mining and machine learning activities with big data within medium to large organisations.

No licensing, legislative or certification requirements apply to this unit at the time of publication.


Elements and Performance Criteria

ELEMENT

PERFORMANCE CRITERIA

Elements describe the essential outcomes.

Performance criteria describe the performance needed to demonstrate achievement of the element.

1. Determine data clustering requirements

1.1 Research organisation’s need for data clustering and define problem, objective and outputs

1.2 Determine required machine and input data set according to task requirements

1.3 Define evaluation protocol and accepted measure of success

1.4 Develop and document required benchmark model

2. Prepare data

2.1 Collect data according to task requirements

2.2 Evaluate data quantity, completeness and alignment according to task requirements

2.3 Transform and format data according to specifications

2.4 Finalise data preparation according to task requirements

3. Cluster data

3.1 Input raw data according to task requirements

3.2 Run required algorithm and adhere to required processing time frame

3.3 Obtain output reports and determine completeness of task according to requirements

4. Finalise data clustering tasks

4.1 Analyse data report and determine clustering tasks have been completed according to task requirements

4.2 Interpret, summarise and document findings

4.3 Communicate findings to required personnel and seek and respond to feedback

4.4 Lodge documentation according to task requirements and finalise task activities according to organisational requirements
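
The elements above can be illustrated with a minimal sketch, which is not part of the unit's requirements. It assumes k-means as the clustering algorithm, within-cluster sum of squares (WCSS) as the measure of success, and a trivial single-cluster grouping as the benchmark model; the unit itself prescribes none of these. The data values, the function names and the 0.5 success threshold are hypothetical and chosen only for illustration.

```python
import math
import random


def standardise(rows):
    """Prepare data: scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Mean position of a non-empty list of points."""
    return [sum(c) / len(c) for c in zip(*points)]


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster data: plain k-means, stopping when assignments no longer change."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # initial centres drawn from the data
    labels = None
    for _ in range(max_iter):  # max_iter stands in for a processing time limit
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centres[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster becomes empty
                centres[j] = centroid(members)
    return labels, centres


def wcss(points, labels, centres):
    """Measure of success: within-cluster sum of squares (lower is tighter)."""
    return sum(sq_dist(p, centres[lab]) for p, lab in zip(points, labels))


# Hypothetical data extract (made-up values): visits per week, monthly spend.
raw = [[1.0, 200.0], [1.2, 210.0], [0.8, 190.0],
       [5.0, 900.0], [5.2, 950.0], [4.8, 880.0]]
data = standardise(raw)

# Assumed benchmark model: every record placed in a single cluster.
benchmark_score = wcss(data, [0] * len(data), [centroid(data)])

labels, centres = kmeans(data, k=2)
model_score = wcss(data, labels, centres)

print(f"benchmark WCSS: {benchmark_score:.2f}")
print(f"k-means WCSS:   {model_score:.2f}")
# Assumed success criterion: at least halve the benchmark WCSS.
print("success criterion met:", model_score < 0.5 * benchmark_score)
```

Recording the benchmark score alongside the model score gives the report a fixed point of comparison, so that a reader can see how much the clustering improves on doing nothing.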

Evidence of Performance

The candidate must demonstrate the ability to complete the tasks outlined in the elements, performance criteria and foundation skills of this unit, including evidence of the ability to:

collect, prepare and cluster data using unsupervised machine learning methodologies and report on the findings on at least two occasions.

In the course of the above, the candidate must:

research industry standard approaches and methodologies for machine learning

evaluate and prepare data.


Evidence of Knowledge

The candidate must be able to demonstrate knowledge to complete the tasks outlined in the elements, performance criteria and foundation skills of this unit, including knowledge of:

methodologies for clustering unlabelled data, including intra-cluster cohesion and inter-cluster separation

industry standard data clustering methodologies including benchmark modelling techniques for data clustering

report writing methodologies relevant to reporting findings of data clustering activities

industry standard machine learning methodologies relevant to unsupervised learning

methodologies for modelling data relevant to unsupervised learning.
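
As an illustration of how within-cluster cohesion and between-cluster separation can be combined into a single figure, the sketch below computes the mean silhouette coefficient. The silhouette coefficient is one common choice and is an assumption here; the unit does not name a specific measure. The function name, the sample points and the labels are hypothetical.

```python
import math


def mean_silhouette(points, labels):
    """Mean silhouette coefficient over all points.

    For each point, a is the mean distance to the other members of its own
    cluster (cohesion) and b is the lowest mean distance to the members of any
    other cluster (separation). The point's score is (b - a) / max(a, b).
    Assumes at least two clusters and that points are not all identical.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # The distance from p to itself is 0, so divide by the other members only.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Made-up points forming two well-separated pairs.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]

good = mean_silhouette(points, [0, 0, 1, 1])  # groups neighbouring points
poor = mean_silhouette(points, [0, 1, 0, 1])  # groups distant points

print(f"sensible grouping: {good:.2f}")
print(f"poor grouping:     {poor:.2f}")
```

Scores close to 1 indicate tight, well-separated clusters; scores near or below 0 suggest that points sit closer to another cluster than to their own.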


Assessment Conditions

Assessment must be conducted in a safe environment where evidence gathered demonstrates consistent performance of typical activities experienced in the data analytics field of work and include access to:

hardware and software and components required for using unsupervised learning for clustering

organisational data reporting style guide and reporting processes required for unsupervised learning and machine learning

a site where activities can be carried out

data required for clustering.

Assessors of this unit must satisfy the requirements for assessors in applicable vocational education and training legislation, frameworks and/or standards.


Foundation Skills

This section describes those language, literacy, numeracy and employment skills that are essential to performance but not explicit in the performance criteria.

SKILL

DESCRIPTION

Numeracy

Uses mathematical formulae to calculate required measurements, determine values and articulate numerical findings

Oral communication

Uses listening and questioning techniques to seek and respond to feedback

Reading

Analyses technical, manufacturer and organisational documentation to determine and confirm job requirements

Writing

Prepares complex documentation detailing benchmark model and findings using relevant language to convey explicit information, requirements and recommendations

Planning and organising

Uses formal, logical planning processes together with an increasingly intuitive understanding of context

Problem solving

Uses nuanced understanding of context to recognise anomalies and subtle deviations from normal expectations, focusing attention and remedying problems as they arise

Self-management

Takes full responsibility for identifying and considering relevant organisational protocols and requirements

Uses systematic processes, setting goals, gathering required information and identifying and evaluating options against agreed criteria

Technology

Identifies principles, concepts, language and practices associated with the digital world


Sectors

Data analytics