Honestly, I do not know why did I start my data science/ engineering journey.
I guess I was purely thinking about how engineers make a lot of money, and data science was hot and trendy in 2021, so I just decided to be one. Honestly, study front-end engineering might have been much useful. However, I was thinking that there will be a job in the future which might require high-end design skill and high-end data science skill just like UX engineer. Also, I love to find out human behaviors and AI is coooool~!
Anyway, here are my notes to SQL for what I have learned so far.
FIRST WEEK
select* from
(Select) is used for bringing data.
ex) select order_no,created_at, user_id, email from orders;
You can filter by adding names of the columns in * or filtering by adding (Where)
ex) select * from point_users
where point >= 5000;
SECOND WEEK
Group by, order by
(Group by) is used for literal to group the items.
ex) select * from point_users
where point >= 5000;
(Order by) is used to organized by numbers or alphabets
You can add either desc, asc to change the order.
ex) select name, count(*) from users
where email like ‘%gmail.com’
group by name
THIRD WEEK
Join
select * from orders o
inner join users u on o.user_id = u.user_id;
FOURTH WEEK
Subquery
with table1 as(select course_id, count(distinct(user_id)) as cnt_checkins from checkins
group by course_id),
table2 as (select course_id, count(*) as cnt_total from orders
group by course_id
) select c.title, a.cnt_checkins, b.cnt_total,
(a.cnt_checkins/b.cnt_total) as ratio
from table1 a
inner join table2 b on a.course_id = b.course_id
inner join courses c on a.course_id = c.course_id;
Substring
select SUBSTRING(created_at,1,10) as date, count(*) from orders
group by date;
with table1 as(select pu.user_id, pu.point,
(case when pu.point > 10000 then ‘10k’
when pu.point > 5000 then ‘5K’
else ‘less than 5k’end) as lv
from point_users pu
)
select a.lv, count(*) from
table1 a
group by a.lv;
Tips