We use cookies to ensure you have the best browsing experience on our website. Please read our cookie policy for more information about how we use cookies.
I came up with the exact solution,except my JOINs are in a different order. Using both my solution and copy pastying this solution I get the same error:
Could someone please shed some light into this? Also, is the "c.difficulty_level = d.difficulty_level" in the WHERE clause neccessary? Isn't that "taken care of" from the JOIN condition?
EDIT: Changing to MySQL resulted in a working code.
yes same for me, also can you please explain why we did s.score=d.score? since by doing that we will be removing some scores like 77, 47 from the submissions table.
s.score is the score of an individual
d.score is the FULL score of that difficulty level.
As the questions asks about only those who scored Full Marks thus s.score= d.score. It only gives the table of those people who got full scores only.
I don't see a need to group by Name theoretically unless there are multiple names for same Hacker ID. The test fails unless you group by Name. Without seeing what is in the test input, I really can't tell what this is about.
You need to group by name, because most DBMSs require each column in the select clause to also be in the group by clause. And you need the name in the select cause because the problems asks for it.
SELECT h.hacker_id, h.name
FROM submissions s
JOIN challenges c
ON s.challenge_id = c.challenge_id
JOIN difficulty d
ON c.difficulty_level = d.difficulty_level
JOIN hackers h
ON s.hacker_id = h.hacker_id
WHERE s.score = d.score
AND c.difficulty_level = d.difficulty_level
GROUP BY h.hacker_id
HAVING COUNT(s.hacker_id) > 1
ORDER BY COUNT(s.hacker_id) DESC, h.hacker_id ASC;
SELECT h.hacker_id, h.name FROM submissions s JOIN challenges c ON s.challenge_id = c.challenge_id JOIN difficulty d ON c.difficulty_level = d.difficulty_level JOIN hackers h ON s.hacker_id = h.hacker_id WHERE s.score = d.score AND c.difficulty_level = d.difficulty_level GROUP BY h.hacker_id,h.name HAVING COUNT(s.hacker_id) > 1 ORDER BY COUNT(s.hacker_id) DESC, h.hacker_id ASC;
this is working after adding h.name for SQL Server
When the order of inner join doesn't matter, I don't understand why my code below doesn't work. Can you help?
SELECT h.hacker_id, h.name
FROM hackers h join challenges c on h.hacker_id = c.hacker_id
join difficulty d on c.difficulty_level = d.difficulty_level
join submissions s on s.hacker_id = h.hacker_id
where s.score = d.score and c.difficulty_level = d.difficulty_level
group by h.hacker_id, h.name
having count(s.hacker_id) > 1
order by count(s.hacker_id) desc, h.hacker_id asc
You cannot join table Hackers and Challeges by challenge_id.
These tables cannot be logically joined.
Since the table Challenges only contain the hackers that have "created" the problem you are eleminating so in the final join of all the tables the hackers who did not create any problems will be eleminated.
same to me, that code is not efficient yet, it still makes redundance code.
This one is my code:
SELECT s.hacker_id, h.name
from submissions s
inner join challenges c
on s.challenge_id = c.challenge_id
inner join difficulty d
on c.difficulty_level = d.difficulty_level
inner join hackers h
on s.hacker_id = h.hacker_id
where s.score = d.score
group by s.hacker_id, h.name
having count(s.hacker_id) > 1
order by count(s.hacker_id)desc, s.hacker_id
Why should we use where c.difficulty_level = d.difficulty_level as we have used it already to get the join on c.difficulty_level = d.difficulty_level ?
select h.hacker_id,h.name from submissions s join challenges c on s.challenge_id = c.challenge_id
join difficulty d on c.difficulty_level = d.difficulty_level join hackers h on s.hacker_id = h.hacker_id where s.score = d.score group by h.hacker_id,h.name having count(s.hacker_id)>1 order by count(h.hacker_id) desc,h.hacker_id asc;
FROM SUBMISSIONS
JOIN CHALLENGES ON SUBMISSIONS.CHALLENGE_ID = CHALLENGES.CHALLENGE_ID
JOIN DIFFICULTY ON DIFFICULTY.DIFFICULTY_LEVEL = CHALLENGES.DIFFICULTY_LEVEL
JOIN HACKERS ON HACKERS.HACKER_ID = SUBMISSIONS.HACKER_ID
WHERE SUBMISSIONS.score = DIFFICULTY.score
GROUP BY SUBMISSIONS.hacker_id, HACKERS.NAME
HAVING COUNT(SUBMISSIONS.hacker_id) > 1
ORDER BY COUNT(SUBMISSIONS.hacker_id) DESC, SUBMISSIONS.hacker_id ASC
select temp.hack, temp.name
from
(select c.hacker_id as hack, h.name as name, count(c.challenge_id) as chid
from
challenges c join difficulty d
on c.difficulty_level = d.difficulty_level
join submissions s
on c.challenge_id = s.challenge_id and c.hacker_id = s.hacker_id
join hackers h
on s.hacker_id = h.hacker_id
where
d.score = s.score
group by
c.hacker_id, h.name) temp
order by temp.chid desc, 1 asc
agreed. given that soooooo many people had this redundant line in their codes (which are posted here), wonder if they borrowed the code from the top comment and reposted as if it's their own version. the redundancy is kinda obvious, even for a novice, so it's kinda surprising that soooo many people made the same mistake.
Redundant because they're doing a join with the difficulty_level earlier in the solution so using it in the where statement isn't required since it's already applied
Correct, one can attampt challenege having difficulty level 4 (of score 60) and might get only 30, so it will confuse whether this 30 belongs to full score of defficulty level 2 or partial score of some other difficulty level.
NO this line is not redudant because imagine you have the same total score in difficulty table then it would create a conflict so its neccesary to match diff.level also bro
Having without Group By . Actually it does but returns 1 or 0 based on the system you run it on, that a having with a group by is like group by () which is implicit.
Could be wrong, but I think if we don't use GROUP BY then there would likely be repeated entries for hackers who made top scores on more than one challenge
The ability to do something doesn't it make it the best or right way to do it though. The 'on' clauses are what we base our table joins on. The where clause is a filter on the joined tables and as such should be the proper place for conditions set on the resultSet.
Hi! Thanks for sharing the solution.
I am a beginner at learning SQL. I can't understand why we are selecting from submissions instead of Hackers. Can anybody guide me here?
Thanks
select s.hacker_id, h.name from hackers as h
inner join submissions as s on h.hacker_id = s.hacker_id
inner join challenges as c on s.challenge_id = c.challenge_id
inner join difficulty as d on c.difficulty_level = d.difficulty_level
where s.score = d.score and c.difficulty_level = d.difficulty_level
group by h.hacker_id, h.name
having count(h.hacker_id) > 1
order by count(h.hacker_id) desc, h.hacker_id;
Could anyone exlain why use h.hacker_id instead of Hackers.hacker_id? If you did not show that Hackers as h, how could you use h. to represents Table Hackers?
What's wrong with this code? INNER JOIN ,being commutative, should work even if the order of the join statements is altered.,Shouldn't it?
select s.hacker_id,h.name
from hackers as h
inner join challenges as c on
h.hacker_id=c.hacker_id
inner join difficulty as d on
d.difficulty_level=c.difficulty_level
inner join submissions as s on
s.challenge_id=c.challenge_id
where d.score=s.score and c.difficulty_level= group by s.hacker_id,h.name
having count(s.hacker_id)>1
order by count(s.hacker_id) desc,s.hacker_id;
Challenges table does not have all the hacker_id entries. Submissions table and Hacker table does. So if we join tables based on h.hacker_id=c.hacker_id then we are missing out on hacker_ids that are not present in Challenges table.
Trying joining Challenges table to Submissions table using challenge_id. It worked for me.
Also, once you have joined the tables it doesnt really matter if you use h.hacker_id or c.hacker_id or s.hacker_id in your ORDER BY clause. It will all work out the same. Atleast for hackerrank it does.
See below:
SELECT h.hacker_id, h.name
FROM SUBMISSIONS s
JOIN HACKERS h
ON s.hacker_id = h.hacker_id
JOIN CHALLENGES c
ON s.challenge_id = c.challenge_id
JOIN DIFFICULTY d
ON c.difficulty_level = d.difficulty_level
WHERE s.score = d.score
GROUP BY h.hacker_id, h.name
HAVING COUNT(h.hacker_id) > 1
ORDER BY COUNT(h.hacker_id) DESC, h.hacker_id
This is a catch, which got me first. We should ignore hacker_id in challenges table because of its meaing: it's the id of the hacker, who created the challenge, not who took the challege. It's a reference table, and it's important to understand its grain/primary key. So, if you use this hacker_id for joins, you'll end up having only a handful of hackers who created challenges.
select h.name,s.hacker_id,s.score from Submissions s
join Challenges c on s.challenge_id = c.challenge_id
join Difficulty d on d.difficulty_level = c.difficulty_level
join Hackers h on h.hacker_id = s.hacker_id
where d.score = s.score
group by s.hacker_id,s.score,h.name
having count(s.hacker_id) > 1
order by s.score desc,s.hacker_id asc;
select z.hacker_id,z.name from
(select h.name,h.hacker_id from Submissions s
inner join Challenges c on s.challenge_id = c.challenge_id
inner join Difficulty d on d.difficulty_level = c.difficulty_level
inner join Hackers h on h.hacker_id = s.hacker_id
and d.score = s.score
group by h.hacker_id,h.name
having count(h.hacker_id) > 1
order by h.hacker_id asc)z;
he did not. there are 5 tables joined together, the id and name are from table h. it's just that the first table in the chain of joins happens to be submission, but it doesnt me that he picked the two columns from submission. the table where id and name are from is dictated by "h.", but not the first table in the joins.
I have almost the same code as yours except that I join submissions and challenges tables on two coditions: s.challenge_id = c.challenge_id and a.hacker_id = c.hacker_id.
I did so because chanllenges table does not suggest difficulty_level is related to chanllenge_id only, it might be related to tracker_id as it shows in sample table. I think my code is more conservative in this sense. However, it gives wrong output.
The question needs to either remove tracker_id column from challenge table or specify challenge_id is unique key or make the real challenge table complete with all tracker_id conditions.
same question. Theoretically inner join doesn't discriminate joined table orders. But it produce wrong asnwer in this case. Here's my code. Anyone can help to shed some light on this question?
SELECT h.hacker_id, h.name
FROM Hackers h JOIN Challenges c
on h.hacker_id=c.hacker_id
JOIN submissions s
on h.hacker_id=s.hacker_id
JOIN difficulty d
on d.difficulty_level= c.difficulty_level
where d.score = s.score AND d.difficulty_level=c.difficulty_level
GROUP BY hacker_id,h.name
HAVING COUNT(s.hacker_id)>1
ORDER BY COUNT(s.hacker_id)DESC,s.hacker_id ASC;
why are we imposing the condition
c.difficulty_level = d.difficulty_level
in the where clause again? Is this not redundant?
As we are already using this condition in our join.
And can someone explain why the order of joins important? because I tried with exact same solution but different order of joins and I got a wrong answer.
The order of joins is flexible, but some of the ON id-matching constraints refer to previous joins. So if you're referring to any of them then they do have to be in the order that they're declared/called.
Here is an alternate way to do this sort of joining. It makes more sense to me and eliminates a bit of the INNER JOIN repetition.
FROM Hackers H
JOIN (Submissions as S,
Challenges as C,
Difficulty as D)
ON (S.Hacker_ID = H.hacker_ID
AND S.Challenge_ID = C.Challenge_ID
AND C.Difficulty_Level = D.Difficulty_Level)
select hacker_id,name from hackers h where h.hacker_id in(select hacker_id from challenges where score in(select max(score) from submissions))group by h.hacker_id,having count(h.hacker_id)>1 order by h.hacker_id asc;
Could you please explain the having count(h.hacker_id)>1 execution? I thought that having clauses get evaluated after the grouping is done. And if you group on hacker_id, name, won't you simply end up with a table filled with only distinct hacker_id's and names? Performing counts on this should then result in a single count for each hacker_id/name.
Where am I getting confused? Is it because of the aggregate count function in the order clause that changes the intermediate evaluation?
Thanks
Great code, thank you! One thing to note is that if you change WHERE to AND in
WHEREs.score=d.score
and move it up right after the second JOIN...ON..., it should run faster, as it would use the condition in a join, not after all the joins. This is my version:
I have submitted this code
select h.hacker_id, h.name
from submissions s
inner join challenges c
on s.challenge_id = c.challenge_id
inner join difficulty d
on c.difficulty_level = d.difficulty_level
inner join hackers h
on s.hacker_id = h.hacker_id
where s.score = d.score and c.difficulty_level = d.difficulty_level
group by h.hacker_id, h.name
having count(s.hacker_id) > 1
order by count(s.hacker_id) desc, s.hacker_id asc
I got the result. I cannot explain query because I have submiited this queries 2 years ago. Sorry for explainaton. I hope you understand.
and what if one hacker has more than one submission to the same challenge with the max score?
this case doesn't happen in the tests, but I think the text is not clear with that.
Your code throws an error, as h.name is not functionally dependent on the columns in the group by clause (h.name is not included in the group by clause) and to remove the redundancy just remove this condition, c.difficulty_level = d.difficulty_level. That should work. Here is my code, that worked fine for me,
SELECT H.HACKER_ID, H.NAME
FROM SUBMISSIONS S
INNER JOIN CHALLENGES C
ON S.CHALLENGE_ID = C.CHALLENGE_ID
INNER JOIN DIFFICULTY D
ON C.DIFFICULTY_LEVEL = D.DIFFICULTY_LEVEL
INNER JOIN HACKERS H
ON S.HACKER_ID = H.HACKER_ID
WHERE S.SCORE = D.SCORE
GROUP BY H.HACKER_ID, H.NAME
HAVING COUNT(H.HACKER_ID) > 1
ORDER BY COUNT(H.HACKER_ID) DESC, S.HACKER_ID ASC
ERROR OCUURS IN MYSQL :
"
ERROR 1055 (42000) at line 1: Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'run_88k14m9gmm5.h.name' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
"
In where clause, is this not redundant "AND c.difficulty_level = d.difficulty_level" as this is already covered by joins as "ON c.difficulty_level = d.difficulty_level "
Can someone tell me why are s.challenge_id = c.challenge_id and
c.difficulty_level = d.difficulty_level equated?
Why are we joining on basis of these two conditions?
I think its the error of HackerRank
In mysql, it's okay not to use aggregation function in select clause when the field is in group by clause.
but in this system, not using aggreation function caused error.
here is my solution
SELECT h.hacker_id, max(h.name)
FROM submissions s
JOIN challenges c
ON s.challenge_id = c.challenge_id
JOIN difficulty d
ON c.difficulty_level = d.difficulty_level
JOIN hackers h
ON s.hacker_id = h.hacker_id
WHERE s.score = d.score
AND c.difficulty_level = d.difficulty_level
GROUP BY h.hacker_id
HAVING COUNT(h.hacker_id) > 1
ORDER BY COUNT(h.hacker_id) DESC, h.hacker_id ASC
Can someone please tell me why my join is not working?
select hackers.hacker_id,hackers.name from hackers INNER JOIN submissions on hackers.hacker_id=submissions.hacker_id INNER JOIN difficulty on submissions.score=difficulty.score INNER JOIN challenges on difficulty.difficulty_level=challenges.difficulty_level where submissions.score = difficulty.score and challenges.difficulty_level = difficulty.difficulty_level group by hackers.hacker_id, hackers.name having count(submissions.hacker_id) > 1 order by count(submissions.hacker_id) desc, submissions.hacker_id asc
If you try to join top to bottom, vs bottom to top, which is what you have done, why does that not work?
i.e. join
a. hackers to challenges
b. challenges to difficulty
c. challenges to submissions?
in group by there is a problem. to remove it on the line group by
write
GROUP BY h.hacker_id,h.name
then after it work in mysql.
because with group by clause we can use aggregation function and the attribute by which we want to group the data
ERROR 1055 (42000) at line 2: Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'run_9oquik1d0bl.h.name' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
But after adding h.name in GROUP BY it worked.
SELECT h.hacker_id, h.name
FROM submissions s
JOIN challenges c
ON s.challenge_id = c.challenge_id
JOIN difficulty d
ON c.difficulty_level = d.difficulty_level
JOIN hackers h
ON s.hacker_id = h.hacker_id
WHERE s.score = d.score
AND c.difficulty_level = d.difficulty_level
GROUP BY h.hacker_id, h.name
HAVING COUNT(s.hacker_id) > 1
ORDER BY COUNT(s.hacker_id) DESC, s.hacker_id ASC;
ERROR 1055 (42000) at line 2: Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'run_9oquik1d0bl.h.name' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
But after adding h.name in GROUP BY it worked.
SELECT h.hacker_id, h.name
FROM submissions s
JOIN challenges c
ON s.challenge_id = c.challenge_id
JOIN difficulty d
ON c.difficulty_level = d.difficulty_level
JOIN hackers h
ON s.hacker_id = h.hacker_id
WHERE s.score = d.score
AND c.difficulty_level = d.difficulty_level
GROUP BY h.hacker_id, h.name
HAVING COUNT(s.hacker_id) > 1
ORDER BY COUNT(s.hacker_id) DESC, s.hacker_id ASC;
This did not work for me. Here is the easy solution..
select h.hacker_id,h.name
from hackers h,challenges c ,difficulty d,submissions s
where h.hacker_id=s.hacker_id
and c.challenge_id=s.challenge_id
and c.difficulty_level=d.difficulty_level
and s.score=d.score
group by h.hacker_id,h.name having count(h.hacker_id)>1
order by count(c.challenge_id) desc,h.hacker_id;
SELECT h.hacker_id, h.name
FROM submissions s
JOIN challenges c
ON s.challenge_id = c.challenge_id
JOIN difficulty d
ON c.difficulty_level = d.difficulty_level
JOIN hackers h
ON s.hacker_id = h.hacker_id
WHERE s.score = d.score
GROUP BY h.hacker_id,h.name
HAVING COUNT(s.hacker_id) > 1
ORDER BY COUNT(s.hacker_id) DESC, s.hacker_id ASC
SELECT h.hacker_id, h.name
FROM submissions s
JOIN challenges c
ON s.challenge_id = c.challenge_id
JOIN difficulty d
ON c.difficulty_level = d.difficulty_level
JOIN hackers h
ON s.hacker_id = h.hacker_id
WHERE s.score = d.score
AND c.difficulty_level = d.difficulty_level
GROUP BY h.hacker_id,h.name,s.hacker_id
HAVING COUNT(s.hacker_id) > 1
ORDER BY COUNT(s.hacker_id) DESC, s.hacker_id ASC
" ...
GROUP BY h.hacker_id
HAVING COUNT(s.hacker_id) > 1
ORDER BY COUNT(s.hacker_id) DESC, s.hacker_id ASC "
can some explain me what does this portion of code do? especially I couldn't understand the use of COUNT() function here
Top Competitors
You are viewing a single comment's thread. Return to all comments →
This is a good code to understand use of joins, thanks!
Agreed! Thanks sevenzeni :) Some of your code is redundant btw, here is a cleaned up version:
I came up with the exact solution,except my JOINs are in a different order. Using both my solution and copy pastying this solution I get the same error:
Could someone please shed some light into this? Also, is the "c.difficulty_level = d.difficulty_level" in the WHERE clause neccessary? Isn't that "taken care of" from the JOIN condition?
EDIT: Changing to MySQL resulted in a working code.
yes same for me, also can you please explain why we did s.score=d.score? since by doing that we will be removing some scores like 77, 47 from the submissions table.
Cause it's asking for "full score", those submission without having a full score will not be considered.
s.score is the score of an individual d.score is the FULL score of that difficulty level. As the questions asks about only those who scored Full Marks thus s.score= d.score. It only gives the table of those people who got full scores only.
when you do qroup by you need to pull all the things you select e.g. select h.hacker_id, h.name
your group by would be group by h.hacker_id, h.name
you cannot
select
acolumn
if it is not ingroup by
clause. Therefore you need togroup by
using the bothhackers.hacker_id, name
.But there are different hacker_id with same name. Why would we need to group by with name also?
EDIT:
Is it because we first group by hacker_id and then by hacker name...? So nothing goes out of order...
I don't see a need to group by Name theoretically unless there are multiple names for same Hacker ID. The test fails unless you group by Name. Without seeing what is in the test input, I really can't tell what this is about.
You need to group by name, because most DBMSs require each column in the select clause to also be in the group by clause. And you need the name in the select cause because the problems asks for it.
SELECT h.hacker_id, h.name FROM submissions s JOIN challenges c ON s.challenge_id = c.challenge_id JOIN difficulty d ON c.difficulty_level = d.difficulty_level JOIN hackers h ON s.hacker_id = h.hacker_id WHERE s.score = d.score AND c.difficulty_level = d.difficulty_level GROUP BY h.hacker_id HAVING COUNT(s.hacker_id) > 1 ORDER BY COUNT(s.hacker_id) DESC, h.hacker_id ASC;
This code works perfectly. thank you.
SELECT h.hacker_id, h.name FROM submissions s JOIN challenges c ON s.challenge_id = c.challenge_id JOIN difficulty d ON c.difficulty_level = d.difficulty_level JOIN hackers h ON s.hacker_id = h.hacker_id WHERE s.score = d.score AND c.difficulty_level = d.difficulty_level GROUP BY h.hacker_id,h.name HAVING COUNT(s.hacker_id) > 1 ORDER BY COUNT(s.hacker_id) DESC, h.hacker_id ASC;
this is working after adding h.name for SQL Server
When the order of inner join doesn't matter, I don't understand why my code below doesn't work. Can you help?
SELECT h.hacker_id, h.name FROM hackers h join challenges c on h.hacker_id = c.hacker_id join difficulty d on c.difficulty_level = d.difficulty_level join submissions s on s.hacker_id = h.hacker_id where s.score = d.score and c.difficulty_level = d.difficulty_level group by h.hacker_id, h.name having count(s.hacker_id) > 1 order by count(s.hacker_id) desc, h.hacker_id asc
This is strange. Did you get to know the reason ?
+1, Have the same issue
You cannot join table Hackers and Challeges by challenge_id. These tables cannot be logically joined.
Since the table Challenges only contain the hackers that have "created" the problem you are eleminating so in the final join of all the tables the hackers who did not create any problems will be eleminated.
I hope this explaination helps.
please explain why this condition is necessary in WHERE clause : c.difficulty_level = d.difficulty_level
i'm getting the correct output even after removing this condition.
same to me, that code is not efficient yet, it still makes redundance code.
This one is my code: SELECT s.hacker_id, h.name from submissions s inner join challenges c on s.challenge_id = c.challenge_id inner join difficulty d on c.difficulty_level = d.difficulty_level inner join hackers h on s.hacker_id = h.hacker_id where s.score = d.score group by s.hacker_id, h.name having count(s.hacker_id) > 1 order by count(s.hacker_id)desc, s.hacker_id
Why should we use where c.difficulty_level = d.difficulty_level as we have used it already to get the join on c.difficulty_level = d.difficulty_level ?
why do we need join with challenges table?
@gustav_danell
After last order by instead of using
ORDER BY COUNT(s.hacker_id) DESC, s.hacker_id ASC
use
ORDER BY COUNT(h.hacker_id) DESC, h.hacker_id ASC
hi,
Try
order by count(h.hacker_id) desc, h.hacker_id asc in your last line.
Just change the h.hacker_id to s.hacker_id in the Select Statement and Group By clause. It'll solve the error.
select h.hacker_id,h.name from submissions s join challenges c on s.challenge_id = c.challenge_id join difficulty d on c.difficulty_level = d.difficulty_level join hackers h on s.hacker_id = h.hacker_id where s.score = d.score group by h.hacker_id,h.name having count(s.hacker_id)>1 order by count(h.hacker_id) desc,h.hacker_id asc;
this will run in ms sql server
you can add name field in group by like this:
group by h.hacker_id, h.name
ADD This : GROUP BY s.hacker_id, h.name instead of GROUP BY h.hacker_id
SELECT SUBMISSIONS.hacker_id, HACKERS.NAME
FROM SUBMISSIONS JOIN CHALLENGES ON SUBMISSIONS.CHALLENGE_ID = CHALLENGES.CHALLENGE_ID JOIN DIFFICULTY ON DIFFICULTY.DIFFICULTY_LEVEL = CHALLENGES.DIFFICULTY_LEVEL JOIN HACKERS ON HACKERS.HACKER_ID = SUBMISSIONS.HACKER_ID
WHERE SUBMISSIONS.score = DIFFICULTY.score GROUP BY SUBMISSIONS.hacker_id, HACKERS.NAME
HAVING COUNT(SUBMISSIONS.hacker_id) > 1 ORDER BY COUNT(SUBMISSIONS.hacker_id) DESC, SUBMISSIONS.hacker_id ASC
u might have used MS Sql
try below modified code for MS Sql server
select temp.hack, temp.name from (select c.hacker_id as hack, h.name as name, count(c.challenge_id) as chid from challenges c join difficulty d on c.difficulty_level = d.difficulty_level join submissions s on c.challenge_id = s.challenge_id and c.hacker_id = s.hacker_id join hackers h on s.hacker_id = h.hacker_id where d.score = s.score group by c.hacker_id, h.name) temp order by temp.chid desc, 1 asc
Set a variable with name sql_mode to empty at the top.
SET sql_mode="";
this should work for you.
line is redundant. You can remove it
agreed. given that soooooo many people had this redundant line in their codes (which are posted here), wonder if they borrowed the code from the top comment and reposted as if it's their own version. the redundancy is kinda obvious, even for a novice, so it's kinda surprising that soooo many people made the same mistake.
why do we use s.score = d.score?
Because they have asked for a list of students who got full marks. "d.score" indicates full marks and "s.score" indicates marks of the submission.
Why is it redundant? I think it's required.
It is not required because a particular challenge_id will correspond to one and only one difficulty level and hence, one and only one maximum score.
Redundant because they're doing a join with the
difficulty_level
earlier in the solution so using it in thewhere
statement isn't required since it's already appliedi dont think its redundant. someone can attempt a 60 marks question of difdiculty 2 and get 30 marks which might correlate to 30 marks of level 1.
Correct, one can attampt challenege having difficulty level 4 (of score 60) and might get only 30, so it will confuse whether this 30 belongs to full score of defficulty level 2 or partial score of some other difficulty level.
I think it's redundant because we already mentioned it in the logic we used to make a master table using join.
NO this line is not redudant because imagine you have the same total score in difficulty table then it would create a conflict so its neccesary to match diff.level also bro
isnt the diff.level already matched when join with c.difficulty_level = d.difficulty_level? why need to match again
agree! already joined on diffculty_level
came here to comment the same! lol
can someone explain why use GROUP BY? Thanks
Most, if not all, hackers in the submissions table have multiple entries. GROUP BY allows you to COUNT the entries that meet the WHERE conditions.
if not using group by clause, then having clause will not work
Having without Group By . Actually it does but returns 1 or 0 based on the system you run it on, that a having with a group by is like group by () which is implicit.
4 years too late but group by to aggregate the hacker_id, so you can then further filter using HAVING
we use group by so that we can count the number of challenges a hacker has gained full marks in
can someone explain why group by two items, i think just hacker_id is enough. thanks
I thought that too, but it doesn´t work with only hacker_id.
When you are selecting two columns in joins,and using Group By then it's necessary to use all the columns which you have taken in select query. Thanks
Because, in group-by queries, all the columns you select must be in the group by clause or they must be aggregations.
I think, since we are using having function (doing a count), we need to have a group by to aggregate.
Could be wrong, but I think if we don't use GROUP BY then there would likely be repeated entries for hackers who made top scores on more than one challenge
All you did ARMaher was take of his 'inner' wording
Edlaierii, can you please tell why this code does not wrok for Oracle.
And removed a redundant "and" clause.
I think we can put the where condition into the join on clauses
The ability to do something doesn't it make it the best or right way to do it though. The 'on' clauses are what we base our table joins on. The where clause is a filter on the joined tables and as such should be the proper place for conditions set on the resultSet.
i guess it would work without this condition as well since the join has already been created under this condition.
c.difficulty_level = d.difficulty_level(in the where clause)
I‘m sorry, this code will have an error since it only groups by the h.hacker_id, you should also group by name. Best,
Hi! Thanks for sharing the solution. I am a beginner at learning SQL. I can't understand why we are selecting from submissions instead of Hackers. Can anybody guide me here? Thanks
You can select from hacker too:
select s.hacker_id, h.name from hackers as h inner join submissions as s on h.hacker_id = s.hacker_id inner join challenges as c on s.challenge_id = c.challenge_id inner join difficulty as d on c.difficulty_level = d.difficulty_level where s.score = d.score and c.difficulty_level = d.difficulty_level group by h.hacker_id, h.name having count(h.hacker_id) > 1 order by count(h.hacker_id) desc, h.hacker_id;
Could anyone exlain why use h.hacker_id instead of Hackers.hacker_id? If you did not show that Hackers as h, how could you use h. to represents Table Hackers?
What's wrong with this code? INNER JOIN ,being commutative, should work even if the order of the join statements is altered.,Shouldn't it?
select s.hacker_id,h.name from hackers as h inner join challenges as c on h.hacker_id=c.hacker_id
inner join difficulty as d on d.difficulty_level=c.difficulty_level inner join submissions as s on s.challenge_id=c.challenge_id where d.score=s.score and c.difficulty_level= group by s.hacker_id,h.name having count(s.hacker_id)>1 order by count(s.hacker_id) desc,s.hacker_id;
Challenges table does not have all the hacker_id entries. Submissions table and Hacker table does. So if we join tables based on h.hacker_id=c.hacker_id then we are missing out on hacker_ids that are not present in Challenges table.
Trying joining Challenges table to Submissions table using challenge_id. It worked for me.
Also, once you have joined the tables it doesnt really matter if you use h.hacker_id or c.hacker_id or s.hacker_id in your ORDER BY clause. It will all work out the same. Atleast for hackerrank it does.
See below:
SELECT h.hacker_id, h.name FROM SUBMISSIONS s JOIN HACKERS h ON s.hacker_id = h.hacker_id JOIN CHALLENGES c ON s.challenge_id = c.challenge_id JOIN DIFFICULTY d ON c.difficulty_level = d.difficulty_level WHERE s.score = d.score GROUP BY h.hacker_id, h.name HAVING COUNT(h.hacker_id) > 1 ORDER BY COUNT(h.hacker_id) DESC, h.hacker_id
'Challenges table does not have all the hacker_id entries. Submissions table and Hacker table does.'
Is this stated in the question, or is it inferred?
This is a catch, which got me first. We should ignore hacker_id in challenges table because of its meaing: it's the id of the hacker, who created the challenge, not who took the challege. It's a reference table, and it's important to understand its grain/primary key. So, if you use this hacker_id for joins, you'll end up having only a handful of hackers who created challenges.
The last order by should be
and not
select h.name,s.hacker_id,s.score from Submissions s join Challenges c on s.challenge_id = c.challenge_id join Difficulty d on d.difficulty_level = c.difficulty_level join Hackers h on h.hacker_id = s.hacker_id where d.score = s.score group by s.hacker_id,s.score,h.name having count(s.hacker_id) > 1 order by s.score desc,s.hacker_id asc;
select z.hacker_id,z.name from (select h.name,h.hacker_id from Submissions s inner join Challenges c on s.challenge_id = c.challenge_id inner join Difficulty d on d.difficulty_level = c.difficulty_level inner join Hackers h on h.hacker_id = s.hacker_id and d.score = s.score group by h.hacker_id,h.name having count(h.hacker_id) > 1 order by h.hacker_id asc)z;
why do you pick select h.hacker_id, h.name from submission?
he did not. there are 5 tables joined together, the id and name are from table h. it's just that the first table in the chain of joins happens to be submission, but it doesnt me that he picked the two columns from submission. the table where id and name are from is dictated by "h.", but not the first table in the joins.
I have almost the same code as yours except that I join submissions and challenges tables on two coditions: s.challenge_id = c.challenge_id and a.hacker_id = c.hacker_id. I did so because chanllenges table does not suggest difficulty_level is related to chanllenge_id only, it might be related to tracker_id as it shows in sample table. I think my code is more conservative in this sense. However, it gives wrong output. The question needs to either remove tracker_id column from challenge table or specify challenge_id is unique key or make the real challenge table complete with all tracker_id conditions.
Agreed!
having count(*)>1 what does it stand for? I dont know why we use having count here. Pls help
That will display on the unique records.
THE ABOVE CODE DOESNT WORK IN MYSQL IT GIVES AN ERROR
Hi, I have a question Does order of joining table matters?
same question. Theoretically inner join doesn't discriminate joined table orders. But it produce wrong asnwer in this case. Here's my code. Anyone can help to shed some light on this question?
why are we imposing the condition c.difficulty_level = d.difficulty_level in the where clause again? Is this not redundant? As we are already using this condition in our join. And can someone explain why the order of joins important? because I tried with exact same solution but different order of joins and I got a wrong answer.
Yes, you're right that it is redundant.
The order of joins is flexible, but some of the ON id-matching constraints refer to previous joins. So if you're referring to any of them then they do have to be in the order that they're declared/called.
Here is an alternate way to do this sort of joining. It makes more sense to me and eliminates a bit of the INNER JOIN repetition.
Dude even the below code is also correct
select hacker_id,name from hackers h where h.hacker_id in(select hacker_id from challenges where score in(select max(score) from submissions))group by h.hacker_id,having count(h.hacker_id)>1 order by h.hacker_id asc;
but generating errors
Could you please explain the
having count(h.hacker_id)>1
execution? I thought that having clauses get evaluated after the grouping is done. And if you group on hacker_id, name, won't you simply end up with a table filled with only distinct hacker_id's and names? Performing counts on this should then result in a single count for each hacker_id/name. Where am I getting confused? Is it because of the aggregate count function in the order clause that changes the intermediate evaluation? Thanksmax score doesnt necessarily equal to full score. edit: plus, you are only ordering by hacker id. there should be two conditions on orders.
AND c.difficulty_level = d.difficulty_level
this condition check is not needed as challenge and difficulty tables are being joined on this condition only.
Great code, thank you! One thing to note is that if you change WHERE to AND in
and move it up right after the second JOIN...ON..., it should run faster, as it would use the condition in a join, not after all the joins. This is my version:
why do we use having count(h.hacker_id)>1 here? pls help
This COde doesn't work and throws error.
sevenzeni's CODE WORKS!!!!
I have submitted this code select h.hacker_id, h.name from submissions s inner join challenges c on s.challenge_id = c.challenge_id inner join difficulty d on c.difficulty_level = d.difficulty_level inner join hackers h on s.hacker_id = h.hacker_id where s.score = d.score and c.difficulty_level = d.difficulty_level group by h.hacker_id, h.name having count(s.hacker_id) > 1 order by count(s.hacker_id) desc, s.hacker_id asc I got the result. I cannot explain query because I have submiited this queries 2 years ago. Sorry for explainaton. I hope you understand.
You removed one single parameter, and ironically without it made the query fail lmao
and what if one hacker has more than one submission to the same challenge with the max score? this case doesn't happen in the tests, but I think the text is not clear with that.
Can someone explain why we need WHERE s.score = d.score AND c.difficulty_level = d.difficulty_level ?
You don't, it's redundant. A few people raised this issue in other comments, did you check there first?
Why do we we test the difficulty level it is a joining parameter should be equal anyway?
Your code throws an error, as h.name is not functionally dependent on the columns in the group by clause (h.name is not included in the group by clause) and to remove the redundancy just remove this condition, c.difficulty_level = d.difficulty_level. That should work. Here is my code, that worked fine for me,
SELECT H.HACKER_ID, H.NAME FROM SUBMISSIONS S INNER JOIN CHALLENGES C ON S.CHALLENGE_ID = C.CHALLENGE_ID INNER JOIN DIFFICULTY D ON C.DIFFICULTY_LEVEL = D.DIFFICULTY_LEVEL INNER JOIN HACKERS H ON S.HACKER_ID = H.HACKER_ID
WHERE S.SCORE = D.SCORE GROUP BY H.HACKER_ID, H.NAME HAVING COUNT(H.HACKER_ID) > 1 ORDER BY COUNT(H.HACKER_ID) DESC, S.HACKER_ID ASC
Please what is the use of " WHERE s.score = d.score AND c.difficulty_level = d.difficulty_level " in the code?
ERROR OCUURS IN MYSQL : " ERROR 1055 (42000) at line 1: Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'run_88k14m9gmm5.h.name' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by "
group by needs h.name too
In where clause, is this not redundant "AND c.difficulty_level = d.difficulty_level" as this is already covered by joins as "ON c.difficulty_level = d.difficulty_level "
Aren't we looking for the highest number of score with max number of challenges then why aren't we using MAX(score) and count (challenge_id)???
group by h.hacker_id, h.name having count(s.hacker_id) > 1
Speaking of redundant code: you don't need this check in WHERE clause:
AND c.difficulty_level = d.difficulty_level
the second where is redundant, as it's the same condition on which the table c and d are joined.
Can someone tell me why are s.challenge_id = c.challenge_id and c.difficulty_level = d.difficulty_level equated? Why are we joining on basis of these two conditions?
I still have to add h.name into GROUP BY. I reaally don't know why I need GROUP BY h.name
I think its the error of HackerRank In mysql, it's okay not to use aggregation function in select clause when the field is in group by clause.
but in this system, not using aggreation function caused error.
here is my solution
SELECT h.hacker_id, max(h.name) FROM submissions s JOIN challenges c ON s.challenge_id = c.challenge_id JOIN difficulty d ON c.difficulty_level = d.difficulty_level JOIN hackers h ON s.hacker_id = h.hacker_id WHERE s.score = d.score AND c.difficulty_level = d.difficulty_level GROUP BY h.hacker_id HAVING COUNT(h.hacker_id) > 1 ORDER BY COUNT(h.hacker_id) DESC, h.hacker_id ASC
why need the c.difficulty_level = d.difficulty_level? since already have join table c and d on difficulty_level?
Can someone please tell me why my join is not working?
select hackers.hacker_id,hackers.name from hackers INNER JOIN submissions on hackers.hacker_id=submissions.hacker_id INNER JOIN difficulty on submissions.score=difficulty.score INNER JOIN challenges on difficulty.difficulty_level=challenges.difficulty_level where submissions.score = difficulty.score and challenges.difficulty_level = difficulty.difficulty_level group by hackers.hacker_id, hackers.name having count(submissions.hacker_id) > 1 order by count(submissions.hacker_id) desc, submissions.hacker_id asc
I believe that "group by h.name" is not redundant, otherwise an error will appear.
If you try to join top to bottom, vs bottom to top, which is what you have done, why does that not work? i.e. join a. hackers to challenges b. challenges to difficulty c. challenges to submissions?
You need to add
h.name
to theGROUP BY
statement.in count why only hacker_id why not submission_id
I tried the same thing by removing group by h.name, and it runs into errors. What's wrong?
in group by there is a problem. to remove it on the line group by write GROUP BY h.hacker_id,h.name then after it work in mysql. because with group by clause we can use aggregation function and the attribute by which we want to group the data
This was giving following error:
But after adding h.name in GROUP BY it worked.
This was giving following error:
But after adding h.name in GROUP BY it worked.
do group by both h.hacker_id and h.hacker_name
how about the full score condition? why does s.score = d.score gives the full score?
@ATMaher
This did not work for me. Here is the easy solution..
select h.hacker_id,h.name from hackers h,challenges c ,difficulty d,submissions s where h.hacker_id=s.hacker_id and c.challenge_id=s.challenge_id and c.difficulty_level=d.difficulty_level and s.score=d.score group by h.hacker_id,h.name having count(h.hacker_id)>1 order by count(c.challenge_id) desc,h.hacker_id;
GROUP BY h.hacker_id, h.name
h.name is missing
SELECT h.hacker_id, h.name FROM submissions s JOIN challenges c ON s.challenge_id = c.challenge_id JOIN difficulty d ON c.difficulty_level = d.difficulty_level JOIN hackers h ON s.hacker_id = h.hacker_id WHERE s.score = d.score
GROUP BY h.hacker_id,h.name HAVING COUNT(s.hacker_id) > 1 ORDER BY COUNT(s.hacker_id) DESC, s.hacker_id ASC
whats the use of
I think that is also redundant while the works get done by
You forgot to add group by on name also
GROUP BY h.hacker_id, h.name
--IN MSSQL
SELECT h.hacker_id, h.name FROM submissions s JOIN challenges c ON s.challenge_id = c.challenge_id JOIN difficulty d ON c.difficulty_level = d.difficulty_level JOIN hackers h ON s.hacker_id = h.hacker_id WHERE s.score = d.score AND c.difficulty_level = d.difficulty_level GROUP BY h.hacker_id,h.name,s.hacker_id HAVING COUNT(s.hacker_id) > 1 ORDER BY COUNT(s.hacker_id) DESC, s.hacker_id ASC
I am little confuse can you explain me this code. How can we use group by when we don't have ant aggregation to perform?
" ... GROUP BY h.hacker_id HAVING COUNT(s.hacker_id) > 1 ORDER BY COUNT(s.hacker_id) DESC, s.hacker_id ASC " can some explain me what does this portion of code do? especially I couldn't understand the use of COUNT() function here