Data Linkage services available

The Data Linkage Services team facilitates access to linked data for approved purposes, such as service evaluation, planning and research.

Data extraction

The process of extracting data for projects involves multiple teams at the Department of Health and other external agencies. Once necessary approvals have been granted (e.g. by relevant Ethics Committees or stakeholders), the Data Outputs team extracts requested data according to the extraction process.

Data linkage

Data linkage is a technique for connecting information from different data sources that are thought to relate to the same person, family, place or event.

Information is created when a person encounters a certain service, for example, when they visit an emergency department, stay in hospital or register the birth of their child. 

Data linkage techniques in WA have been developed to ensure the best possible matching while also protecting personal privacy. Linked records typically consist of two components: 

  • Demographic data — identifiable information such as a person's name or address
  • Content data — information about what happened to the person, such as diagnosis and hospital treatment. 

The WA Health Minimum Data Requirements for Linkage document outlines the mandatory requirements and recommended items that enable high-quality and efficient data linkage practices and maintain the integrity of the WA Data Linkage System.

Privacy is protected by separating the content data from the demographic data before it is provided for linkage. This practice is known as the 'separation principle'. Specialised computer programs do most of the matching, but for some of the more difficult matches, data engineers will interpret the records and make a decision on whether it is a 'true match'.

The Data Linkage team matches the demographic information, and then makes a unique ID, called a 'linkage key', for each group of records that belong to one person. These keys can then be used for approved requests, to join the content data of the records, without releasing the person's name or other identifying information. 
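The separation principle and the role of the linkage key can be sketched in a few lines of Python. This is a minimal illustration only, not the Department's linkage system: the records, field names and the `LK0001`-style key format are invented for the example, and exact matching on name and date of birth stands in for the far more sophisticated matching described above.

```python
from collections import defaultdict

# Hypothetical source records mixing demographic and content fields.
records = [
    {"record_id": "H1", "name": "Jane Citizen", "dob": "1980-02-01", "diagnosis": "asthma"},
    {"record_id": "E7", "name": "jane citizen", "dob": "1980-02-01", "diagnosis": "fracture"},
    {"record_id": "H2", "name": "John Smith", "dob": "1975-11-30", "diagnosis": "influenza"},
]

DEMOGRAPHIC_FIELDS = ("name", "dob")


def separate(record):
    """Split one record into a demographic part and a content part.

    Both parts keep the record ID so they can be re-associated later.
    """
    demographic = {"record_id": record["record_id"]}
    content = {"record_id": record["record_id"]}
    for field, value in record.items():
        if field == "record_id":
            continue
        target = demographic if field in DEMOGRAPHIC_FIELDS else content
        target[field] = value
    return demographic, content


def assign_linkage_keys(demographics):
    """Map each record ID to a linkage key shared by records of one person.

    Assumption: records match when the lower-cased name and the date of
    birth are identical. Real linkage is considerably more tolerant.
    """
    key_by_person = {}
    key_by_record = {}
    for demo in demographics:
        identity = (demo["name"].lower(), demo["dob"])
        if identity not in key_by_person:
            key_by_person[identity] = f"LK{len(key_by_person) + 1:04d}"
        key_by_record[demo["record_id"]] = key_by_person[identity]
    return key_by_record


def join_content(contents, key_by_record):
    """Group content data by linkage key; no demographic data is involved."""
    grouped = defaultdict(list)
    for content in contents:
        grouped[key_by_record[content["record_id"]]].append(content)
    return dict(grouped)


demographics, contents = zip(*(separate(r) for r in records))
linkage_keys = assign_linkage_keys(demographics)
linked = join_content(contents, linkage_keys)
```

In this sketch the matching step sees only the demographic parts, and the joined output in `linked` contains only content data grouped under a linkage key.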

The WA Data Linkage System (WADLS) stores the linkage keys created by the Data Linkage team. To create and maintain the WADLS, the WA Department of Health's Data Linkage and Systems teams have developed a bespoke in-house linkage system, termed 'DLS3'. The system is highly versatile, and completely integrates and streamlines all aspects of the 'end to end' linkage process.

For more information about the utility of linked data, refer to:

Project facilitation and advice

ISPD Client Services offers a centralised service to assist Data Applicants with advice on available datasets, project design, data governance, logistics for data delivery and cost estimates. As part of this service, the ISPD Client Services team coordinates the data application process.

For enquiries related to non-research requests, contact ISPDClientServices@health.wa.gov.au.

For enquiries related to research requests, contact DataServ@health.wa.gov.au.

For enquiries related to ethics or ethical approval, contact HREC@health.wa.gov.au.

For enquiries related to research governance, including site authorisation, contact DoH.RGO@health.wa.gov.au.

Derived Aboriginal and Torres Strait Islander status flag

The Data Linkage Services team can generate a derived Aboriginal and Torres Strait Islander status flag as an additional linked data product.

A validated algorithm is used to create this flag for any individual with at least one record in a number of WA government administrative data sets where Aboriginal and Torres Strait Islander status is recorded.

The algorithm uses the information from several records in an individual’s chain to produce an overall derived Aboriginal and Torres Strait Islander status of “yes”, “no” or “missing”, for that individual.

Datasets used to derive Aboriginal and Torres Strait Islander status include:

  • WA Birth Registrations
  • WA Death Registrations
  • Midwives notifications
  • Hospital Morbidity Data Collection Records
  • Emergency Department Data Collection Records

The number of records used to assign this status varies depending on the number and type of datasets and records in the chain. Data recipients must note that their project could receive data for an individual where the data sets provided report an Aboriginal and Torres Strait Islander status of ‘no’ but the Aboriginal and Torres Strait Islander status flag is ‘yes’.

The Aboriginal and Torres Strait Islander status flag reflects the status indicated for a person across all available collections and records, and may therefore differ from what is reported in a specific record or collection.
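The following Python sketch illustrates how a flag derived from a whole chain of records can differ from the status on the records a project actually receives. It does not reproduce the validated algorithm: the rule used here (the majority of non-missing values, with ties treated as “missing”) is a placeholder assumption chosen purely for illustration, and the dataset names and values are invented.

```python
from collections import Counter


def derive_status(chain_statuses):
    """Return 'yes', 'no' or 'missing' for one individual's chain.

    Placeholder rule (an assumption, NOT the validated algorithm):
    take the majority of the non-missing values; a tie or a chain with
    no recorded status gives 'missing'.
    """
    counts = Counter(s for s in chain_statuses if s in ("yes", "no"))
    if counts["yes"] > counts["no"]:
        return "yes"
    if counts["no"] > counts["yes"]:
        return "no"
    return "missing"


# Hypothetical chain: the status recorded in each collection for one person.
chain = {
    "birth_registration": "yes",
    "midwives_notification": "yes",
    "hospital_morbidity": "no",
    "emergency_department": "missing",
}

flag = derive_status(chain.values())

# A project that receives only hospital records sees 'no' on those records,
# while the flag derived from the whole chain is 'yes'.
supplied_to_project = {"hospital_morbidity": chain["hospital_morbidity"]}
```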

This algorithm is based on the outcomes of the Getting Our Story Right project, a cross-agency collaboration between WA Health, the Australian Bureau of Statistics, and the Kids Research Institute, which recommended how to best use existing information resources to measure the gap in Aboriginal and Torres Strait Islander disadvantage.

You can read more about development of the algorithm in this academic journal article (Christensen et al, 2016).

Research projects involving Aboriginal and Torres Strait Islander people and communities should seek ethical approval through the Western Australian Aboriginal Health Ethics Committee (WAAHEC). See our Ethics page for more details. 

Family connections

The WA Family Connections System contains links between individuals who are related, created using information recorded on original birth registrations and midwives’ notifications. These relationships are usually (but not always) biological. No information is known about adoptions, including step, local or overseas adoptions.

Currently, the genealogy held by the Department of Health includes parents and siblings of people born in WA since 1945. Extended family members (including grandparents, grandchildren, cousins, aunts and uncles) can also be identified.
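As a rough illustration of how extended family members can be identified from parent links alone, the Python sketch below derives siblings, grandparents, aunts and uncles, and cousins from a small child-to-parents table. The identifiers and the table layout are hypothetical and are not the structure of the WA Family Connections System.

```python
from collections import defaultdict

# Hypothetical parent links: person -> set of recorded parents.
parents_of = {
    "C1": {"M1", "F1"},
    "C2": {"M1", "F1"},
    "M1": {"GM1", "GF1"},
    "A1": {"GM1", "GF1"},  # sibling of M1, so an aunt or uncle of C1
    "K1": {"A1"},          # child of A1, so a cousin of C1
}

# Reverse index: parent -> set of children.
children_of = defaultdict(set)
for child, parents in parents_of.items():
    for parent in parents:
        children_of[parent].add(child)


def siblings(person):
    """People sharing at least one recorded parent (includes half-siblings)."""
    found = set()
    for parent in parents_of.get(person, ()):
        found |= children_of.get(parent, set())
    found.discard(person)
    return found


def grandparents(person):
    """Recorded parents of the person's recorded parents."""
    found = set()
    for parent in parents_of.get(person, ()):
        found |= parents_of.get(parent, set())
    return found


def aunts_and_uncles(person):
    """Siblings of the person's recorded parents."""
    found = set()
    for parent in parents_of.get(person, ()):
        found |= siblings(parent)
    return found


def cousins(person):
    """Children of the person's aunts and uncles."""
    found = set()
    for relative in aunts_and_uncles(person):
        found |= children_of.get(relative, set())
    return found
```

For example, `siblings("C1")` gives `{"C2"}` and `cousins("C1")` gives `{"K1"}` for the table above.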

The availability of family connections information arose from the WA Family Connections Project, which was started in 2003. 

Research capabilities

Population-based genealogies are rare due to the challenges of developing and maintaining such a resource on a large scale. The combination of genealogy and health data for the WA population provides a unique opportunity to investigate the inheritance of human disease.

Data may be used to assess the degree of relatedness of individuals within study samples, locate common ancestors, estimate genetic risk or describe the familial burden of comorbid conditions.

Geocoding

Geocoding is a process that involves converting an address into a latitude and longitude coordinate, using a set of reference data. This map point can then be placed within spatial boundaries such as the statistical area levels 1 and 2 (SA1 and SA2 respectively) and Local Government Area (LGA).
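The two steps described above (address to coordinate, then coordinate to boundary) can be sketched in Python as follows. Everything in the example is invented for illustration: the address reference table, the coordinates, and the rectangular areas `AREA_A` and `AREA_B`. Real geocoding relies on authoritative address reference data and ABS boundary files. The point-in-polygon test uses the standard ray casting method.

```python
# Hypothetical reference data: normalised address -> (longitude, latitude).
ADDRESS_REFERENCE = {
    "1 example street, perth": (115.86, -31.95),
}

# Hypothetical boundaries: area code -> polygon as (longitude, latitude) vertices.
BOUNDARIES = {
    "AREA_A": [(115.80, -32.00), (115.90, -32.00), (115.90, -31.90), (115.80, -31.90)],
    "AREA_B": [(115.90, -32.00), (116.00, -32.00), (116.00, -31.90), (115.90, -31.90)],
}


def geocode(address):
    """Look up an address in the reference data; None if it is not found."""
    return ADDRESS_REFERENCE.get(" ".join(address.lower().split()))


def point_in_polygon(lon, lat, polygon):
    """Ray casting: count polygon edges crossed by a ray running east from the point."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def assign_area(lon, lat):
    """Return the code of the first boundary containing the point, else None."""
    for code, polygon in BOUNDARIES.items():
        if point_in_polygon(lon, lat, polygon):
            return code
    return None


coordinate = geocode("1 Example Street,  Perth")
area = assign_area(*coordinate) if coordinate else None
```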

Data Linkage Services assigns the boundaries and derives the indices using mapping and concordance tables created by the Australian Bureau of Statistics (ABS). More information can be found on the ABS website in the census reference area.

The Department of Health currently has geocoded data for all census years from 1996 until 2021 for the following data collections:

  • Midwives Notifications
  • Hospital Morbidity Data Collection Data
  • Emergency Department Data Collections
  • Death Registrations
  • Mental Health Information System

Matched comparison group selection

Data Linkage Services can select comparison or control populations for study cohorts to facilitate case-control studies.

Controls can be selected to meet several criteria, including:

  • demographic characteristics (e.g. year of birth, sex)
  • location (postcodes, SEIFA or other geographical features)
  • clinical outcomes or characteristics (e.g. admitted to hospital within the same year and month).

Controls are usually frequency matched but can be individually matched to cases if required.
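Frequency matching selects controls so that their distribution across the matching strata mirrors that of the cases, without pairing any control to a particular case. The Python sketch below shows one way this could be done, assuming strata defined by year of birth and sex. The function name, record layout and data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def frequency_match(cases, candidates, ratio, key, seed=0):
    """Select `ratio` controls per case within each stratum.

    `key` maps a person to their stratum. Cases are excluded from the
    candidate pool. If a stratum has too few candidates, all available
    candidates in that stratum are taken.
    """
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    case_ids = {c["id"] for c in cases}
    cases_per_stratum = Counter(key(c) for c in cases)

    pool = defaultdict(list)
    for person in candidates:
        if person["id"] not in case_ids:
            pool[key(person)].append(person)

    selected = []
    for stratum, n_cases in cases_per_stratum.items():
        available = pool.get(stratum, [])
        needed = min(n_cases * ratio, len(available))
        selected.extend(rng.sample(available, needed))
    return selected


def stratum(person):
    return (person["yob"], person["sex"])


cases = [
    {"id": 1, "yob": 1980, "sex": "F"},
    {"id": 2, "yob": 1980, "sex": "F"},
    {"id": 3, "yob": 1975, "sex": "M"},
]
candidates = (
    [{"id": 100 + i, "yob": 1980, "sex": "F"} for i in range(10)]
    + [{"id": 200 + i, "yob": 1975, "sex": "M"} for i in range(10)]
    + [{"id": 300 + i, "yob": 1990, "sex": "F"} for i in range(10)]
)

controls = frequency_match(cases, candidates, ratio=2, key=stratum)
control_distribution = Counter(stratum(c) for c in controls)
```

With a ratio of 2, the two cases born in 1980 receive four controls from the same stratum between them, and the single 1975 case receives two.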

Data sources

For adult cases, controls are usually selected from the electoral roll, such that each control was a current elector during the year in which a corresponding case had their “index event”. This does not guarantee they were resident in WA at the time but makes it more probable.

For child cases, controls can be selected from Birth Registrations or from midwives’ notifications. Postcode matching is not very reliable in these cases: the address on the midwives’ Notification of Case Attended Form is not necessarily the mother’s usual address, and not all birth registrations have the parents’ address. There is no way to ensure that children selected as controls were still resident in WA on the case’s index date.

Controls may be selected from other datasets, if appropriate for the given study.

Standard exclusions

Cases are usually excluded from being controls. Stillborn children will be excluded from controls selected from midwives’ notifications or birth registrations unless specifically requested otherwise. It is also possible to exclude known relatives via the Family Connections System.

It is also possible to exclude people based on information in other datasets, e.g. exclude all women who are known to have had a hysterectomy, or exclude men who had lung cancer in a certain time period.

Control to case ratio

Data applicants should advise ISPD Client Services of their required ratio, noting that a high number of controls (e.g. 10:1 or above) may require justification. In some cases, the matching criteria may need to be relaxed to find a suitable control (e.g. where a large number is requested or the criteria are highly specific).
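The effect of relaxing matching criteria can be illustrated with a simple individual matching sketch in Python. It assumes matching on sex and year of birth, and widens the permitted difference in year of birth one year at a time until the requested ratio is met or a limit is reached. The function, the `max_year_gap` parameter and the data are hypothetical.

```python
def individually_match(cases, candidates, ratio, max_year_gap=2):
    """Match each case to up to `ratio` unused controls of the same sex.

    Exact year of birth is tried first; the allowed gap is then widened
    by one year at a time up to `max_year_gap`. A case may end up with
    fewer controls than requested if none are suitable.
    """
    used = set()
    matches = {}
    for case in cases:
        chosen = []
        for gap in range(max_year_gap + 1):
            for person in candidates:
                if len(chosen) == ratio:
                    break
                if person["id"] in used or person["sex"] != case["sex"]:
                    continue
                if abs(person["yob"] - case["yob"]) == gap:
                    chosen.append(person["id"])
                    used.add(person["id"])
            if len(chosen) == ratio:
                break
        matches[case["id"]] = chosen
    return matches


cases = [
    {"id": 1, "yob": 1980, "sex": "F"},
    {"id": 2, "yob": 1980, "sex": "F"},
]
candidates = [
    {"id": 10, "yob": 1980, "sex": "F"},
    {"id": 11, "yob": 1981, "sex": "F"},
    {"id": 12, "yob": 1980, "sex": "M"},
    {"id": 13, "yob": 1984, "sex": "F"},
]

matches = individually_match(cases, candidates, ratio=2)
```

Here the first case obtains its second control only after the year of birth criterion is relaxed by one year, and the second case receives no controls because the only remaining candidate of the same sex falls outside the two-year limit.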

Sample selections

For some requests, a sample of identifying information selected from the Western Australian Electoral Roll can be provided.

These samples can be selected based on a variety of characteristics, and can be used for various purposes, including study invitations, pursuant to the appropriate approvals being granted.

Privacy preserving record linkage

The WA Department of Health implements a ‘privacy by design’ model for data linkage, which utilises the 'separation principle' (see Privacy) and clear text identifiers to ensure the highest quality matches whilst protecting personal privacy. In instances where clear text identifiers cannot be provided, Privacy Preserving Record Linkage (PPRL) may be a viable option.

PPRL is a linkage methodology that preserves the privacy of individuals through specialised software to irreversibly ‘hash’ demographic variables and derive a new string. This new string enables record linkage but does not identify an individual, which significantly reduces the risk of sharing record-level data.
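The idea of deriving a non-identifying string from demographic variables can be sketched with a keyed hash from the Python standard library. This is a simplified illustration and not the software or method used by the Department: the field names, the normalisation rules and the secret key are assumptions, and production PPRL tools commonly use more elaborate encodings (such as Bloom filters) so that minor spelling differences can still be matched.

```python
import hashlib
import hmac


def normalise(value):
    """Lower-case the value and collapse whitespace so trivial differences do not change the hash."""
    return " ".join(value.strip().lower().split())


def hash_identifiers(record, secret_key, fields=("given_name", "surname", "dob")):
    """Derive a keyed SHA-256 digest from the selected demographic fields.

    The same person hashed with the same key gives the same string, which
    allows exact matching without exchanging the underlying values.
    """
    message = "|".join(normalise(record[f]) for f in fields)
    return hmac.new(secret_key, message.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key; in practice it would be agreed between data custodians
# and never shared with the party performing the linkage.
SECRET_KEY = b"example-key-not-for-real-use"

record_a = {"given_name": "Jane", "surname": "Citizen", "dob": "1980-02-01"}
record_b = {"given_name": " jane ", "surname": "CITIZEN", "dob": "1980-02-01"}
record_c = {"given_name": "John", "surname": "Smith", "dob": "1975-11-30"}

hash_a = hash_identifiers(record_a, SECRET_KEY)
hash_b = hash_identifiers(record_b, SECRET_KEY)
hash_c = hash_identifiers(record_c, SECRET_KEY)
```

In this sketch `hash_a` and `hash_b` are identical, so the two records can be linked, while `hash_c` differs. A keyed hash is used rather than a plain hash so that the strings cannot be reproduced by someone who does not hold the key.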

PPRL can be used to support data sharing and integration for government services, research, and other initiatives to improve health outcomes at state and national levels, while adhering to the Australian Privacy Principles.

The Department of Health can provide hashed data for record linkage at third party linkage agencies.

For further information, please view the Privacy Preserving Record Linkage (PPRL) Guide.

Contact us

The Client Services team will assist with any questions or concerns you might have – get in touch today to see how we can work together to help make your project a reality.

Feedback survey

If you have recently received data from Data Linkage Services, please complete the WA Department of Health Data Services Client Feedback Survey. Feedback from service users will be used to capture and proactively address common user issues.

The survey will take approximately 5 minutes to complete.