How do I learn to write half-decent queries?
My employer recently gave me access to the company datamart, and I've been able to put together some useful reports/extracts, but I really have no idea what the fuck I'm doing beyond left-joining dimension tables onto fact tables (plus or minus a few inner joins to subqueries) until the data is roughly in the shape I need.
E.g. we have this very tall table of customer data which, for some reason, the source system uploads unpivoted, kind of like this:
>[NATURALKEY], [CLIENTDATACODE], [VALUE], [OBJECTID], [CREATEDWHEN], etc.
>8974617, Client_Surname_SYS, KIMBALL, 524215, 2025-02-17 16:42:57.00
>8974618, Client_Forename_SYS, Ralph, 524215, 2025-02-17 16:42:57.00
>8974619, Client_Gender_SYS, Male, 524215, 2025-02-17 16:42:57.00
>etc
>8974645, Client_Postcode_SYS, 60098, 524215, 2025-02-17 16:42:57.00
>8974646, Client_Surname_SYS, ROSS, 524216, 2025-02-18 08:23:31.00
>etc etc.
I worked out I could pivot it back into columns with conditional aggregation, i.e. MIN(IFF([CLIENTDATACODE]='Client_Surname_SYS', [VALUE], NULL)) AS "Client_Surname" under a GROUP BY, but the table is so big that the performance was shit. Then I found I could build a mini dimension table with the VALUES clause, i.e. VALUES ('Client_Surname_SYS'), ('Client_Forename_SYS'), etc., and inner join it onto the tall table before the GROUP BY, which seemed to encourage the query optimiser (this is Snowflake SQL, by the way) to throw out the millions of rows I don't need or care about. Roughly, it looked like the sketch below.
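In rough outline it was something like this (I've invented the table name CLIENT_DATA_TALL and trimmed the attribute list down to the four codes in the sample above, so treat it as a simplified sketch rather than my exact production query):

    -- Mini dimension of just the attribute codes we care about.
    -- Snowflake names the columns of a VALUES clause COLUMN1, COLUMN2, ...
    WITH wanted_codes AS (
        SELECT column1 AS CLIENTDATACODE
        FROM VALUES
            ('Client_Surname_SYS'),
            ('Client_Forename_SYS'),
            ('Client_Gender_SYS'),
            ('Client_Postcode_SYS')
    )
    SELECT
        t.OBJECTID,
        -- Conditional aggregation: each MIN(IFF(...)) collapses the rows
        -- for one attribute code into a single column per OBJECTID.
        MIN(IFF(t.CLIENTDATACODE = 'Client_Surname_SYS',  t.VALUE, NULL)) AS "Client_Surname",
        MIN(IFF(t.CLIENTDATACODE = 'Client_Forename_SYS', t.VALUE, NULL)) AS "Client_Forename",
        MIN(IFF(t.CLIENTDATACODE = 'Client_Gender_SYS',   t.VALUE, NULL)) AS "Client_Gender",
        MIN(IFF(t.CLIENTDATACODE = 'Client_Postcode_SYS', t.VALUE, NULL)) AS "Client_Postcode"
    FROM CLIENT_DATA_TALL AS t
    -- The inner join is the performance trick: rows with any other
    -- CLIENTDATACODE are discarded before the GROUP BY ever sees them.
    INNER JOIN wanted_codes AS w
        ON w.CLIENTDATACODE = t.CLIENTDATACODE
    GROUP BY t.OBJECTID;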
Probably not a great example, but if you have any advice or tips/tricks, I'd really appreciate it.