Most Commonly Asked SQL Interview Topics - Part 1

Neema MV
Aug 2, 2021
3 min read

Updated: Aug 9, 2021

Hi All!! Hope you all had a great weekend!! As part of next series, I have planned to cover all the most commonly asked sql topics for data analyst/data science interview. The series is split into different parts that will be posted every week on Monday. Hope you like the content!

Most Commonly Asked SQL Interview Topics:

1. CASE WHEN:

CASE WHEN statements are used often when we need to manipulate data as per the analysis. It can either be count of the column or segregate the data statistics as per the column. The three different type of uses of CASE WHEN statements:

1. A simple case statements:
Case when if gender = ‘M’ or gender = ‘F’ then ‘YES’ else ‘NO’ 		  END as col_name
2. With numbers:
 Case when if score >=90 then ‘A’
 When score >=80 and score <90 then ‘B’
 When score >=60 and score <80 then ‘C’
 Else ‘D’ end as grade
3. With aggregation:
 count(case when gender = ‘M’ then 1 else 0) end as male_count?

2. Self joins:

A SELF JOIN is simply when you join a table with itself. There is no SELF JOIN keyword, you just write an ordinary join where both tables involved in the join are the same table.

Note: One thing to notice is that when you are self joining it is necessary to use an alias for the table otherwise the table name would be ambiguous.

SELECT T2.name
FROM category T1
JOIN category T2
ON T2.parent = T1.id
WHERE T1.name = 'Kitchen'

3. Distinct and Group By:

DISTINCT, as the name says filters out distinct/unique combination of values for the columns specified in the SELECT query. GROUP BY, is used in collaboration with the SELECT statement to arrange/partition identical data into groups. It is typically used with aggregate functions like COUNT, MIN, MAX, AVG, SUM. There is one more use of GROUP BY.

GROUP BY behaves like DISTINCT when you don’t use any aggregate function in the SELECT query.

DISTINCT collects all of the rows (including any expressions) without any pre-aggregation and tosses out duplicates.

GROUP BY, aggregates the result set and then displays the distinct values.

Another difference is their logical order of operations. Group by operation takes place first logically whereas Distinct operation is solved logically later.

Here are the logical order of the most important operations:
FROM (including JOIN, APPLY, etc.)
WHERE
GROUP BY (can remove duplicates)
Aggregations
HAVING
Window functions
SELECT
DISTINCT (can remove duplicates)
UNION, INTERSECT, EXCEPT (can remove duplicates)
ORDER BY
OFFSET
LIMIT

Hence, GROUP BY has better performance!

4. Left vs outer joins :

Left join: Returns all the rows from the LEFT table and matching records between both the tables.

In terms of efficiency, left join is not faster than inner join however it much better than outer join. It has to perform all the work of an inner join along with the additional work of null-extending the results.