Intro
This topic lists a number of data clean-up operations available through SQL and Magic ETL DataFlows and shows how data will look both before and after using an operation.Changing Your Data Type
Example: Changing the data type of an ID number from a numeric to a text field| In MySQL… | In Magic ETL… | Before | After |
|---|---|---|---|
CAST( | Use the “Set Column Type” tile | 1 | 1 (looks the same but behaves a text dimension) |
Concatenating Columns to Create a Compound Field
Example: Concatenating “First Name” and “Last Name” columns to create a “Full Name” column| In MySQL… | In Magic ETL… | Before | After |
|---|---|---|---|
CONCAT( | Use the “Combine Columns” tile | ‘John’ | ‘Smith’ (first and last name in two separate columns) | ‘John Smith’ (a single column containing the complete name) |
Extracting a Portion of a Text String
Example: Extracting the first part of an email address to use as a user ID| In MySQL… | In Magic ETL… | Before | After |
|---|---|---|---|
SUBSTRING_INDEX( | Use the “Replace Text” tile on that column to specify which portion of the string should be replaced with an empty string: @.+ | userid@email.com | userid |
Reformatting a Date
Example: Formatting a non-standard date string into a date type format| In MySQL… | In Magic ETL… | Before | After |
|---|---|---|---|
STR_TO_DATE( | Use the “Set Column Type” tile. | 23.01.2017 (string data type) | 01/23/2017 (date data type) |
Deriving Date Attributes from a Date Column
Example: Extracting the day of the week from a date column| In MySQL… | In Magic ETL… | Before | After |
|---|---|---|---|
DAYNAME( | Use the “Date Operations” tile. | 01/23/2017 | Monday |
Splitting a Column into Two Columns Based on a Character in the Column
Example: Dividing a “Status Code” column into status code parts based on the / delimiter found within the column| In MySQL… | In Magic ETL… | Before | After |
|---|---|---|---|
SUBSTRING_INDEX( | Use the “Replace Text” tile on that column to specify which portion of the string should be replaced with an empty string: /.+ for the first part and .+/ for the second. | SHI/DELV | SHI | DELV |
Trimming Erroneous Spaces from a Column
Example: Trimming the leading and trailing spaces from the “Department” column| In MySQL… | In Magic ETL… | Before | After |
|---|---|---|---|
TRIM( | Use a regular expression within the “Replace Text” tile to pinpoint the leading and training spaces and replace them with nothing. | ’ department ’ | ‘department’ |
Changing the Case of an Entire Column
Example: Changing the “Category” column to uppercase letters| In MySQL… | In Magic ETL… | Before | After |
|---|---|---|---|
UPPER( | Use the All upper case option within the “Text Formatting” tile. | health | HEALTH |
Capitalizing the First Letter of a Column
Example: Capitalizing the first letter of the first word in the “Category” column| In MySQL… | In Magic ETL… | Before | After |
|---|---|---|---|
CONCAT(UPPER(LEFT( | Use the Capitalize first letter option within the “Text Formatting” tile. | health | Health |
Categorizing Rows Based off the Value in a Specific Column
Example: Assigning a region to each row based on a store number| In MySQL… | In Magic ETL… | Before | After |
|---|---|---|---|
(CASE WHEN | Use the Filter Rows tile to separate into distinct groups based on store number, use Add Constants to add a category to each group, then use Append Rows to stitch the rows back together. | store_number = 100 | store_number = 100 | region = ‘region_1’ |
Categorizing Rows Based off the Value in a Specific Column
Example: Assigning a category to the animal name based on the first letter of the name| In MySQL… | In Magic ETL… | Before | After |
|---|---|---|---|
(CASE WHEN | Use a regular expression within the “Replace Text” tile to identify the patterns and assign a category. For example ^H.+ finds names starting with H, ^B.+ finds names starting with B, ^A.+|^E.+ finds names starting with either A or E, and ^[^H,B,A,E].+ finds names that do not start with H, B, A, or E. | animal_names = ‘Horse’ | animal_names = ‘Horse’ | animal_names_categories = ‘Hs’ |
Categorizing Rows Based off a Numeric Value Threshold
Example: Categorizing rows based on salary amount| In MySQL… | In Magic ETL… | Before | After |
|---|---|---|---|
(CASE WHEN | Use the “Filter Rows” tile to separate into distinct groups based on store number, use “Add Constants” to add a category to each group, then use “Append Rows” to stitch the rows back together. | salary = $45,000 | salary = $45,000 | salary_bucket = ‘$50,000’ |