top of page

Most Commonly Asked SQL Interview Topics - Part 1



Hi All!! Hope you all had a great weekend!! As part of next series, I have planned to cover all the most commonly asked sql topics for data analyst/data science interview. The series is split into different parts that will be posted every week on Monday. Hope you like the content!

Most Commonly Asked SQL Interview Topics:


1. CASE WHEN:

CASE WHEN statements are used often when we need to manipulate data as per the analysis. It can either be count of the column or segregate the data statistics as per the column. The three different type of uses of CASE WHEN statements:

1. A simple case statements:
Case when if gender = ‘M’ or gender = ‘F’ then ‘YES’ else ‘NO’ 		  END as col_name
2. With numbers:
 Case when if score >=90 then ‘A’
 When score >=80 and score <90 then ‘B’
 When score >=60 and score <80 then ‘C’
 Else ‘D’ end as grade
3. With aggregation:
 count(case when gender = ‘M’ then 1 else 0) end as male_count?

2. Self joins:

A SELF JOIN is simply when you join a table with itself. There is no SELF JOIN keyword, you just write an ordinary join where both tables involved in the join are the same table.

Note: One thing to notice is that when you are self joining it is necessary to use an alias for the table otherwise the table name would be ambiguous.

SELECT T2.name
FROM category T1
JOIN category T2
ON T2.parent = T1.id
WHERE T1.name = 'Kitchen'

3. Distinct and Group By:

DISTINCT, as the name says filters out distinct/unique combination of values for the columns specified in the SELECT query. GROUP BY, is used in collaboration with the SELECT statement to arrange/partition identical data into groups. It is typically used with aggregate functions like COUNT, MIN, MAX, AVG, SUM. There is one more use of GROUP BY.


GROUP BY behaves like DISTINCT when you don’t use any aggregate function in the SELECT query.

DISTINCT collects all of the rows (including any expressions) without any pre-aggregation and tosses out duplicates.

GROUP BY, aggregates the result set and then displays the distinct values.


Another difference is their logical order of operations. Group by operation takes place first logically whereas Distinct operation is solved logically later.

Here are the logical order of the most important operations:
FROM (including JOIN, APPLY, etc.)
WHERE
GROUP BY (can remove duplicates)
Aggregations
HAVING
Window functions
SELECT
DISTINCT (can remove duplicates)
UNION, INTERSECT, EXCEPT (can remove duplicates)
ORDER BY
OFFSET
LIMIT

Hence, GROUP BY has better performance!


4. Left vs outer joins :

Left join: Returns all the rows from the LEFT table and matching records between both the tables.

In terms of efficiency, left join is not faster than inner join however it much better than outer join. It has to perform all the work of an inner join along with the additional work of null-extending the results.

Outer join:

It combines the result of the left join and right join.

In terms of performance, it is definitely logical that outer join returns a large number of records which further increases its total execution time.



5. Union:

UNION is one of the set operation where deduplication step is involved

It is used to combine results from two or more select statements but one criteria.

  • The number of column between the select statements used should be same.

  • Also the columns have to be similar data types, and in same order.

where as, UNION ALL is similar to union except that union all allows duplicates rows as well.


Resources For SQL Practice:


Part 2 will be out next week. If you like this content, please like, share and subscribe!!


Python Series:


Comments


bottom of page