ODAT Logo

Chapter 1: Automating data cleansing, processing, and dataset creation

User-friendly GUI toolkits for cleansing, standardizing, and generating structured datasets from raw long-format medical data.

Tool 1

About Tool 1

This GUI-based application streamlines the cleaning and reformatting of raw, long-format medical data from the Hong Kong Hospital Authority (HA) into standardized CSV and XLSX outputs. Designed with simplicity in mind, it is particularly valuable for studies focusing on Chinese-specific cohorts, comorbidity patterns, temporal trends, and regional health outcomes.

Tool 2

About Tool 2

This GUI application lets you select multiple CSV files, automatically merges only their common columns into one combined dataset, and saves the result as a new CSV.
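The common-column merge can be sketched roughly as follows. This is a minimal illustration using only Python's standard library, not the tool's actual code: the function name `merge_common_columns` and the sample columns are hypothetical, and the choice to keep the column order of the first file is an assumption.

```python
import csv
import io


def merge_common_columns(sources):
    """Read each CSV source and keep only the columns present in all of them."""
    tables = []
    for src in sources:
        reader = csv.DictReader(src)
        rows = list(reader)
        tables.append((reader.fieldnames or [], rows))
    if not tables:
        return [], []
    # Assumption: preserve the column order of the first file.
    first_header = tables[0][0]
    common = [c for c in first_header
              if all(c in header for header, _ in tables[1:])]
    # Stack the rows of every file, restricted to the shared columns.
    merged = [{c: row[c] for c in common} for _, rows in tables for row in rows]
    return common, merged


# Hypothetical sample data standing in for two selected files.
a = io.StringIO("id,age,sex\n1,70,F\n2,65,M\n")
b = io.StringIO("id,sex,ward\n3,M,A1\n")
columns, rows = merge_common_columns([a, b])
```

Here `columns` contains only `id` and `sex`, since `age` and `ward` each appear in just one input. Writing the result back out would be a matter of passing `columns` to `csv.DictWriter`.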

Tool 3

About Tool 3

This GUI application lets you open a CSV, automatically remove rows with duplicate reference keys, and save the result as a new CSV.
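A minimal sketch of key-based de-duplication is shown below. It assumes that the first row seen for each key is the one to keep, which the description does not specify; the function name and the `reference_key` column name are hypothetical.

```python
def drop_duplicate_keys(rows, key):
    """Keep the first row for each distinct value of `key` (assumed policy)."""
    seen = set()
    unique = []
    for row in rows:
        value = row[key]
        if value not in seen:
            seen.add(value)
            unique.append(row)
    return unique


# Hypothetical rows, as they might come from csv.DictReader.
records = [
    {"reference_key": "A001", "visit": "2019-01-04"},
    {"reference_key": "A002", "visit": "2019-02-11"},
    {"reference_key": "A001", "visit": "2020-07-30"},
]
deduplicated = drop_duplicate_keys(records, "reference_key")
```

If the latest row should win instead, iterating over `reversed(rows)` and reversing the result would be one way to adapt the sketch.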

Tool 4

About Tool 4

This GUI application lets you open a CSV and automatically drop irrelevant columns, streamlining your dataset for focused analysis.
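As a rough illustration, dropping columns from parsed CSV rows can look like the sketch below. How the tool decides which columns are irrelevant is not stated, so this version assumes the caller supplies the list; the function and column names are hypothetical.

```python
def drop_columns(rows, unwanted):
    """Return copies of the rows without the columns named in `unwanted`."""
    unwanted = set(unwanted)
    return [{k: v for k, v in row.items() if k not in unwanted} for row in rows]


# Hypothetical rows with two columns that are not needed for analysis.
records = [
    {"reference_key": "A001", "age": "70", "clerk_id": "x9", "remark": ""},
    {"reference_key": "A002", "age": "65", "clerk_id": "x4", "remark": "n/a"},
]
trimmed = drop_columns(records, ["clerk_id", "remark"])
```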

Tool 5

About Tool 5

This GUI application lets you open a CSV and automatically change its encoding to UTF-8, ensuring compatibility across platforms.
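One way to approach the re-encoding step is to try a short list of candidate encodings and re-encode the first successful decode as UTF-8. The sketch below is an assumption about how such a conversion could work, not the tool's implementation; in particular the candidate list (`utf-8-sig`, `big5`, `cp1252`) is a guess at encodings that may show up in this setting.

```python
def convert_to_utf8(raw, candidates=("utf-8-sig", "big5", "cp1252")):
    """Decode bytes with the first candidate encoding that works, return UTF-8 bytes."""
    for encoding in candidates:
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # try the next candidate
        return text.encode("utf-8")
    raise ValueError("none of the candidate encodings could decode the input")


# Hypothetical file content saved in Big5.
original = "name,note\nChan,中文\n"
converted = convert_to_utf8(original.encode("big5"))
```

Trial decoding can pick the wrong encoding when several candidates decode without error, so the order of `candidates` matters and the output is worth a visual check.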

Tool 6

About Tool 6

This GUI application lets you create full diagnosis (dx) events based on ICD-9 codes for a large number of diseases from raw HA diagnostic data.
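The general idea of deriving diagnosis events from ICD-9 codes can be sketched as below. Everything about the structure is an assumption made for illustration: the record layout (patient, code, date), matching codes by prefix, and defining an event as the earliest matching date per patient and disease. The three code stems used (250 for diabetes mellitus, 401 for essential hypertension, 410 for acute myocardial infarction) are standard ICD-9 categories, but the tool's real disease list is not given here.

```python
from datetime import date

# Illustrative mapping only; the tool's actual disease definitions are not shown.
ICD9_PREFIXES = {
    "diabetes": ("250",),
    "hypertension": ("401",),
    "myocardial_infarction": ("410",),
}


def first_dx_events(records, prefixes=ICD9_PREFIXES):
    """Map (patient, disease) to the earliest date with a matching ICD-9 code."""
    events = {}
    for patient, code, dx_date in records:
        for disease, stems in prefixes.items():
            if code.startswith(stems):  # str.startswith accepts a tuple of prefixes
                key = (patient, disease)
                if key not in events or dx_date < events[key]:
                    events[key] = dx_date
    return events


# Hypothetical long-format diagnosis records.
records = [
    ("P1", "250.00", date(2015, 3, 1)),
    ("P1", "250.02", date(2012, 6, 9)),
    ("P1", "410.9", date(2018, 1, 20)),
    ("P2", "401.9", date(2016, 5, 5)),
    ("P2", "V58.67", date(2017, 8, 2)),
]
events = first_dx_events(records)
```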

Tool 7

About Tool 7

This GUI application lets you check match counts between two CSV files based on specific columns, providing quick insights into overlapping data.
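A match count between two files can be approximated with set operations on the chosen columns. The sketch below is hypothetical in its function name, column names, and the decision to count distinct values rather than rows.

```python
def count_matches(rows_a, col_a, rows_b, col_b):
    """Count distinct key values found in both files or in only one of them."""
    keys_a = {row[col_a] for row in rows_a}
    keys_b = {row[col_b] for row in rows_b}
    return {
        "both": len(keys_a & keys_b),
        "only_a": len(keys_a - keys_b),
        "only_b": len(keys_b - keys_a),
    }


# Hypothetical rows from two CSV files with differently named key columns.
cohort = [{"reference_key": "A001"}, {"reference_key": "A002"}, {"reference_key": "A003"}]
labs = [{"patient": "A002"}, {"patient": "A003"}, {"patient": "A003"}, {"patient": "A009"}]
summary = count_matches(cohort, "reference_key", labs, "patient")
```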

Tool 8

About Tool 8

This GUI application automates CSV cleaning by standardizing headers, extracting numeric data, handling missing values, and generating reports, then saves processed files to a designated folder.
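Two of the cleaning steps named above, header standardization and numeric extraction, can be illustrated with short helpers. The rules used here (lowercase snake_case headers, first number found in a cell, `None` for missing values) are assumptions for the sake of the example and may differ from what the tool does.

```python
import re


def standardize_header(name):
    """Lowercase a header and collapse non-alphanumeric runs into underscores."""
    name = re.sub(r"[^0-9a-z]+", "_", name.strip().lower())
    return name.strip("_")


def extract_number(value):
    """Return the first number in a cell as a float, or None if there is none."""
    match = re.search(r"-?\d+(?:\.\d+)?", value)
    return float(match.group()) if match else None


# Hypothetical header and cell values.
header = standardize_header(" Lab Result (mmol/L) ")
values = [extract_number(v) for v in ["7.2 mmol/L", "<0.5", "N/A", ""]]
```

Note that this extraction discards qualifiers such as `<`, which may matter for censored laboratory values.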

Tool 9

About Tool 9

This GUI application lets you remove columns where all values are zero, helping to declutter and optimize your datasets.
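A minimal sketch of the all-zero check follows. It assumes that a cell counts as zero only if it parses as a number equal to 0, so blank or text cells keep their column; the function name and sample columns are hypothetical.

```python
def drop_all_zero_columns(header, rows):
    """Remove columns in which every value parses as numeric zero."""
    def is_zero(value):
        try:
            return float(value) == 0.0
        except (TypeError, ValueError):
            return False  # blanks and text are not treated as zero

    if not rows:
        return list(header), []
    keep = [c for c in header if not all(is_zero(r[c]) for r in rows)]
    return keep, [{c: r[c] for c in keep} for r in rows]


# Hypothetical indicator columns, one of which is never set.
header = ["id", "dx_diabetes", "dx_stroke"]
rows = [
    {"id": "1", "dx_diabetes": "0", "dx_stroke": "0"},
    {"id": "2", "dx_diabetes": "1", "dx_stroke": "0.0"},
]
kept, cleaned = drop_all_zero_columns(header, rows)
```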

Tool 10

About Tool 10

This GUI-based application allows you to load a CSV file, identify missing values, and apply various imputation techniques.
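The description does not list which imputation techniques are offered, so the sketch below shows only two common ones, mean and median imputation, as an illustration. Missing values are represented as `None`, and the function name is hypothetical.

```python
from statistics import mean, median


def impute(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing to estimate from
    fill = {"mean": mean, "median": median}[strategy](observed)
    return [fill if v is None else v for v in values]


# Hypothetical numeric column with one missing value.
column = [1.0, None, 3.0, 8.0]
missing_count = sum(v is None for v in column)
mean_filled = impute(column, "mean")
median_filled = impute(column, "median")
```

Single-value imputation like this understates variability, so it is best treated as a baseline rather than a recommended default.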
