Quotes

Saturday, April 4, 2020

SQL basics, joins, agg functions



1) Why data is kept in different table, why not a single table with all the data?

- Speed
- fact tables(orders, sales, invoices), dimensional tables(customer data)


Database Normalization

There are essentially three ideas that are aimed at database normalization:
  1. Are the tables storing logical groupings of the data?
  2. Can I make changes in a single location, rather than in many tables for the same information?
  3. Can I access and manipulate data quickly and efficiently?

Having column- is useful when you want to have a similar to where condition on aggregated column which is not possible with WHERE keyword

wrong: where SUM(room_nights)> 10

Correct way: :Having SUM(room_nights)> 10

SELECT COUNT(*) num_reps_above5
FROM(SELECT s.id, s.name, COUNT(*) num_accounts
     FROM accounts a
     JOIN sales_reps s
     ON s.id = a.sales_rep_id
     GROUP BY s.id, s.name
     HAVING COUNT(*) > 5
     ORDER BY num_accounts) AS Table1;



CASE statement examples:

SELECT a.name, SUM(total_amt_usd) total_spent, 
     CASE WHEN SUM(total_amt_usd) > 200000 THEN 'top'
     WHEN  SUM(total_amt_usd) > 100000 THEN 'middle'
     ELSE 'low' END AS customer_level
FROM orders o
JOIN accounts a
ON o.account_id = a.id
WHERE occurred_at > '2015-12-31' 
GROUP BY 1
ORDER BY 2 DESC;

Date_Trunc and DATE_PART functions

SELECT DATE_TRUNC('month', o.occurred_at) ord_date, SUM(o.gloss_amt_usd) tot_spent
FROM orders o 
JOIN accounts a
ON a.id = o.account_id
WHERE a.name = 'Walmart'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 1;



SELECT DATE_PART('year', occurred_at) ord_year,  SUM(total_amt_usd) total_spent
 FROM orders
 GROUP BY 1
 ORDER BY 2 DESC;


case query:

select case
when condition1 then ''
when condition2 then '' end
from table1;



select case
when condition* then case when condition1 then ''
                     case when condition2 then '' end
then
else '' end
from table1



select case
when a+b>c then
case 
when a=b and b=c then 'Equilateral'
when a=b or b=c or a=c then 'Isosceles'
when a!=b and b!=c then 'Scalene' end
else 'Not A Triangle' end

from triangles;

SET @number = 21;

SELECT REPEAT('* ', @number := @number - 1) FROM information_schema.tables WHERE @number > 0;

No comments:

Post a Comment