Took me forever to figure this out. It's a really cool problem - the breakthrough comes when you realize that, since sequentiality is important, the array should be walked in reverse, because the only values you want to check are those at higher indices than the one you are checking.
The other big thing to realize is that you're checking for two things, not just one: you're checking for the existence of a value r times the current value, and you're also checking for an existing pair that begins at r times the current value and contains r times that value (i.e., r^2 times the current one).
For example: let's say r is 3. If the value you're checking while iterating the array is 3, and there's a 9 and a 27 in the dictionary that you've been populating, you should have populated another dictionary with an entry for the pair of values (9, 27), because that is a pair where 27 is r times 9; and if we find a 3 at a lower index, then we know we have a triplet.
So the algorithm is:
-- keep two dictionaries:
1. a dictionary to store the number of times each single value appears in the array
2. a dictionary to store any pair of values that are i and (i * r) (using i as the key)
-- Walk the array backwards
1. if the pair dictionary has a value for r times the one you're checking, add that number of pairs to the overall count.
2. then, if there's a value r times the one you're checking in the single value dictionary, add a new pair (or increase the existing pair count) in the pair dictionary.
3. finally, add the value itself to the single value dictionary. (Note these steps aren't mutually exclusive - the code performs all three checks on every iteration.)
And that will do it.
In Python, O(N) time complexity:
def countTriplets(arr, r):
    count = 0
    dict = {}       # value -> number of times it has been seen so far (to the right)
    dictPairs = {}  # value v -> number of (v, v*r) pairs seen so far (to the right)

    for i in reversed(arr):
        if i*r in dictPairs:
            # i completes each (i*r, i*r*r) pair into a full triplet
            count += dictPairs[i*r]
        if i*r in dict:
            # i starts a new (i, i*r) pair for every i*r already seen
            dictPairs[i] = dictPairs.get(i, 0) + dict[i*r]
        dict[i] = dict.get(i, 0) + 1

    return count
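For instance (an illustrative input of my own, not necessarily the judge's exact test data), arr = [1, 2, 2, 4] with r = 2 contains the triplet (1, 2, 4) twice, once for each of the two 2s:

    print(countTriplets([1, 2, 2, 4], 2))  # -> 2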
I'm not 100% sure I understand everything that is going on when r=1, versus my code, which got stuck on it. But for what it's worth, I found my translation of the code above a bit easier to consume for non-Python speakers. Works just as well.
C# ver:
long triplets = 0;
Dictionary<long, long> dict = new Dictionary<long, long>();
Dictionary<long, long> dictPairs = new Dictionary<long, long>();

for (int index = arr.Count - 1; index >= 0; index--)
{
    if (dictPairs.ContainsKey(arr[index] * r))
        triplets += dictPairs[arr[index] * r];

    if (dict.ContainsKey(arr[index] * r))
    {
        if (dictPairs.ContainsKey(arr[index]))
            dictPairs[arr[index]] = dictPairs[arr[index]] + dict[arr[index] * r];
        else
            dictPairs[arr[index]] = dict[arr[index] * r];
    }

    if (!dict.ContainsKey(arr[index]))
        dict.Add(arr[index], 1);
    else
        dict[arr[index]]++;
}
return triplets;
This solution is extremely clever — or at least, it took me a while to understand it. :)
In case it is of use to anyone else, I rewrote the code with inline comments and descriptive variable names, which helped me to get it. I refer to a harmonic pair as two elements where the leftmost element is the base and the rightmost element is the head, and the head = r * base.
def countTriplets(arr, r):
    # running count of triplets
    triplet_count = 0

    # maps element val to count of elements with that value, to our right
    val_to_count = {}

    # maps element val to count of harmonic pairs with that element
    # as the base, where the head of the pair is to our right
    baseval_to_paircount = {}

    # iterate in reverse, so that when we check if an element
    # is the base of a harmonic pair, we can be sure we have already seen
    # any potential head of the pair, since it must be to our right,
    # since we only consider triplets where i < j < k so we only consider
    # pairs where i < j
    for base in reversed(arr):
        head = base * r

        # 1. determine if this pair's head is the base of a pair to the right
        #
        # In other words, check if this is the base of a triplet
        #
        # if this element's head is another pair's base ...
        if head in baseval_to_paircount:
            # ... then we found a triplet.
            # increase the triplet count by the number of
            # pairs using our head as their base
            triplet_count += baseval_to_paircount[head]

        # 2. update the count of pairs with this base value
        #
        # (We must do this second, so as to not over-increment
        # triplet_count by adding ourselves to baseval_to_paircount
        # when base == head.)
        #
        # if this element's head has appeared to the right in the array ...
        if head in val_to_count:
            # ... then increase the count of pairs starting at our current base
            # by the number of times head has appeared so far
            baseval_to_paircount[base] = baseval_to_paircount.get(base, 0) + val_to_count[head]

        # 3. Note this element value appeared, for future iterations
        # examining elements to the left
        #
        # (We must do this last, so we don't over-increment
        # baseval_to_paircount by counting ourselves as our own head
        # when base == head.)
        val_to_count[base] = val_to_count.get(base, 0) + 1

    return triplet_count
I followed your approach to answer in Java but it doesn't work for the test case 3, with 100 000 times the same number and r = 1. I guess it's an edge case but I can't figure out the expected behavior when r = 1.
static long countTriplets(List<Long> arr, long r) {
    long count = 0;
    Map<Long, Integer> dict = new HashMap<>();
    Map<Long, Integer> pairs = new HashMap<>();
    for (int i = arr.size() - 1; i >= 0; --i) {
        long current = arr.get(i);
        long multiple = current * r;
        if (pairs.containsKey(multiple)) {
            count += pairs.get(multiple);
        }
        if (dict.containsKey(multiple)) {
            int existingPairs = pairs.getOrDefault(current, 0);
            int newPairs = dict.get(multiple);
            pairs.put(current, existingPairs + newPairs);
        }
        int val1 = dict.getOrDefault(current, 0) + 1;
        dict.put(current, val1);
    }
    return count;
}
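A likely culprit for that exact failing case - an educated guess from the constraints, not something verified against the actual test data: with 100,000 copies of the same value and r = 1, the per-value pair counts grow to about n^2/2, around 5 * 10^9, which overflows the Integer values in the pairs map (and the int locals) even though count itself is a long. Storing the pair counts as long should avoid it:

    // Hypothetical fix: pair counts can exceed Integer.MAX_VALUE when
    // 10^5 equal values occur with r == 1, so store them as long.
    Map<Long, Long> pairs = new HashMap<>();
    // ... and inside the loop:
    long existingPairs = pairs.getOrDefault(current, 0L);
    long newPairs = dict.get(multiple);  // occurrence counts still fit in an int
    pairs.put(current, existingPairs + newPairs);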
It's a brilliant solution, but I don't understand why you say that it needs to be walked in reverse.
You can do exactly the same in the opposite direction, just look for i/r rather than i * r, or am I missing something?
Interesting approach. Different to mine but definitely interesting.
Please share your approach
The search function is your assistant here. Already posted in the comments.
It's actually similar to mine, but from a different direction. Your solution didn't need to handle the case arr[i] % r != 0, which I fell into for a while.
Bro can you show your code?
Here I have explained it in a very simple way: https://youtu.be/KZ8k9-22JmQ
@sattujaiswal
The explanation is super. Highly recommended.
This solution, like many others, claims to be O(n) because there is only one 'for' loop involved. But that idea is totally flawed, because access in a dictionary (or an STL map in C++) is O(log(n)). This means the overall worst case complexity, when all the elements are distinct, is O(n log(n)).
Dictionary access for all operators used here is O(1), not O(log n). https://www.ics.uci.edu/~pattis/ICS-33/lectures/complexitypython.txt
I didn't know this. Thank you. But here is something I read.
https://stackoverflow.com/a/1963514
So, in a worst case scenario, the operation may be O(n). This makes the Python implementation O(n^2) in the worst case.
You need to understand how a hashmap internally works, especially in case of collisions.
In most cases, the hashmap maps to a list (or another hashmap!) of values that fall under the same key. Collisions are rare for small data sets. And only in an EXTREMELY RARE case, where ALL your values fall under the same key, will the hashmap map every item to the same bucket, and thus, in a list implementation of collision handling, it would be O(N).
Hope that helps. You can read more about dictionaries/hashmap to understand better :)
What I understand from your comment is that we are depending on the hardware (memory) provided to calculate the complexity of an algorithm. I think that is quite incorrect. Making assumptions like "virtually infinite memory" can lead to a lot of errors. And yes, I agree about the "extremely rare" part and therefore accept that the average case complexity comes out to be O(1).
Uhh.. I never mentioned anything about hardware dependencies? Read again.
Any hash function's efficiency is restricted by the memory available to implement it. The term "small data set" itself is relative to amount of memory available to you. The only way you can ensure that the collisions are "extremely rare" is by knowing your hardware supports that kind of memory. You might not have explicitly mentioned that thing, but that's what it boils down to.
I recommend reading up on universal hash families; they might answer some of your questions. Definitely anything higher than a gig should be plenty to handle all the test cases. You might be able to prove that after reading up on universal hash families.
https://www.cs.cmu.edu/~avrim/451f11/lectures/lect1004.pdf
A very nice read indeed. I went through a few more lectures to make myself clear on this topic, especially because I seem to have received a lot of downvotes on my comment.
https://wiki.python.org/moin/TimeComplexity
So to clear my side of the argument, I would like you to go through the above link where they mention the average case being O(1) and (amortized) worst case for dictionary access being O(n).
https://stackoverflow.com/questions/2771368/can-hash-tables-really-be-o1
Also, the accepted answer in the above link mentions the exact reasons for this worst case i.e. failing assumptions.
The worst scenario depends on the hashing function. Since the keys here are integers, O(1) dict access is guaranteed by the underlying Python hash function.
O(1) is not "guaranteed". O(1) is the expected average case (and I'd suggest quite optimistic for an "average" case). Given that hashes (or dicts) are usually based on RB trees, O(1) is a "luck hit" (supposing caching of subtrees in the RB tree). Sometimes when the hash key is an integer a form of (RB balanced) trie is used (depends on implementation). In most cases, using an int hash, results will be at or below O(log N), specially when in this case we are talking of unique keys (story is totally different if duplicate keys are allowed - worst case O(N)). If you're lucky, you'll get a hit on the first bit of the RB traversal (and match not found and caching helping), it's basically O(1). If you have a positive hit on the 1st bit (and multiple subtrees), you need to traverse down (and that ends up approaching O(logN)). I don't know the specifics of the Python implementations, but I am not aware that anything better than RB trees are in use.
You're right about O(1) being the average case.
Python dicts, however, are not based on RB trees but on hash tables with open addressing collision resolution. The probing method in use is double hashing (details here).
You can look up the implementation of Python's hash function here. With _PyHASH_BITS defined as 31 on 32-bit platforms and 61 on 64-bit platforms, all integers dealt with in the current problem have different hash values.
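Concretely (CPython on a 64-bit build): a non-negative int hashes to itself reduced modulo the Mersenne prime 2^61 - 1, so the values in this problem's range all get distinct hash values:

    >>> hash(100000)
    100000
    >>> hash(2**61 - 1)  # wraps around the modulus
    0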
Having read the links, O(1) still seems to be the "best optimal case". O(log N) still would appear to be the "standard expected case".
However, I will certainly be willing to concede if any implementation of a HashMap can guarantee O(1).
Technically I think it's infeasible (though previous attempts have tried).
Holding my "bated" breath.
Can you provide a reference for O(log(n)) access time for STL unordered_map? I believe in general unordered_map search, insert, and delete are all O(1) in the average case and O(n) in the worst case, which should be quite rare for this problem (in which we are hashing numbers).
Hi, I am trying really hard but I can't wrap my head around this:
Can you please explain how you are able to determine the final count of triplets using this? Thanks
From what I understand, dictPairs holds the pairs of values at higher indices than the one you're currently looking at. So dictPairs.get(i, 0) is looking to see if a pair already exists; if not, it sets the frequency to 0. dict[i*r] is getting the frequency of the higher value of the pair and adding it to the frequency of the current pair. So if the current i is the first time you've encountered the number, and i times r has been seen, you want to set the number of pairs to the current number of pairs plus the frequency of the higher number in the pair.

Let's say you've seen 2 twice, 4 once, and haven't seen 1 yet: the number of pairs keyed by 2 should be 2, namely (2(1), 4(1)) and (2(2), 4(1)). When you see 1 the first time, you increment the count by 2 (that's the `if i*r in dictPairs` branch), and you also add an entry to dictPairs with 1 as the key, with value 2 (the number of times you've seen 2), representing the pairs (1(1), 2(1)) and (1(1), 2(2)). The next time you see 1, you increment the count again and dictPairs[1] grows by 2, for a total of 4, representing the pairs (1(1), 2(1)), (1(1), 2(2)), (1(2), 2(1)), and (1(2), 2(2)).
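As a quick check of that walk-through (using a hypothetical input matching it: two 2s and one 4 to the right of two 1s, with r = 2), the countTriplets above indeed reports 4:

    print(countTriplets([1, 1, 2, 2, 4], 2))  # -> 4: the four (1, 2, 4) triplets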
Thanks so much. This really helped
incredible, great solution
Nice... with Counter looks almost magic:
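For reference, a sketch of what a Counter-based variant of the same two-dictionary idea might look like (not necessarily the original poster's exact code) - Counter returns 0 for missing keys, so the .get(..., 0) calls and membership tests disappear:

    from collections import Counter

    def countTripletsCounter(arr, r):
        count = 0
        singles = Counter()  # value -> occurrences seen so far (to the right)
        pairs = Counter()    # value v -> number of (v, v*r) pairs seen so far
        for i in reversed(arr):
            count += pairs[i * r]       # missing keys simply read as 0
            pairs[i] += singles[i * r]  # i starts a new pair per i*r already seen
            singles[i] += 1
        return count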
Nice solution. In C#, for example, there is no such specialized counter dictionary, so in this case it is very useful (I think).
The most important advantage is the lack of if/elses, which can easily be forgotten in corner cases. Without good test coverage, that goes almost undetected till the very rare bug appears in prod 😉
Yes, very nice. It also improves readability.
Alternative without using the collections module:

def countTriplets(arr, r):
    a = 0
    cpairs = dict()
    cnumber = dict()
    # keys of cpairs = values in arr; corresponding values of cpairs = number of
    # possible pairs with the number <key's number> whereby the 2nd element of
    # each pair = <key's number>*r, and s.t. the keys occur after i in the list arr.
    # values of cnumber = the number of times that <key's number> has been
    # encountered so far when looking at arr in reverse.
    for i in reversed(arr):
        if i*r in cpairs:
            a += cpairs[i*r]
        if not (i in cnumber):
            cnumber[i] = 0
        if i*r in cnumber:
            if i in cpairs:
                cpairs[i] += cnumber[i*r]
            else:
                cpairs[i] = cnumber[i*r]
        # necessary to put this line right at the end of the loop
        # to properly handle the edge case r == 1
        cnumber[i] += 1
    return a
Can somebody explain this code to me? I am new to Python.
This is based on the idea that, as we move through the array from the last to the first element, we use a dictionary to keep track of the 'running' number of times each number has occurred.
We use a second dictionary to keep track of the 'running' number of times each pair of the form (x, rx) has occurred, where x is a number in the array. If we encounter a number, say k, such that r*k equals a number we encountered at least once before, then the number of additional pairs (k, rk) that can be formed with that number increases by the number of times rk has been encountered before.
Finally, for each number y, we add to the count the number of pairs of the form (yr, yr^2) that have occurred so far, and return that count after scanning through the entire array from back to front.
We could reverse-engineer this idea to implement a version where we start scanning through the array from the front to back (i.e. the other way around).
If there is anything code-wise that you don't understand, please specify the exact parts that require clarification.
Thanks
@manndras I highly recommend that you read the editorial! You seem like the guy who would enjoy it. The left map / right map solution mentioned there is O(n) (I think there's a typo, since the poster called it n lg n). Anyway, it's actually even simpler than what you have (not to say that your explanations weren't helpful; they were!).
I did this.
C# version:
So, in this question, are we assuming that the array is in sorted order, or does it not matter?
Same approach with incremental loop https://www.hackerrank.com/challenges/count-triplets-1/forum/comments/819013
Can you please explain this code? I did not get it.
You don't have to reverse it. Instead, just divide i by r (i/r) and loop forward. All you have to do is keep track of pairs to get your answer of triplets.
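For what it's worth, here is a sketch of that forward version (my own illustration; note it needs a divisibility guard, since i/r only makes sense when i is divisible by r):

    def countTripletsForward(arr, r):
        count = 0
        left = {}   # value -> occurrences seen so far (to the left)
        pairs = {}  # value v -> number of (v/r, v) pairs seen so far
        for x in arr:
            if x % r == 0:
                count += pairs.get(x // r, 0)  # x completes (x/r^2, x/r) pairs
                pairs[x] = pairs.get(x, 0) + left.get(x // r, 0)  # x extends earlier x/r's
            left[x] = left.get(x, 0) + 1
        return count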
Can also be done forward without divisions :D
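One way to do that (a sketch of my own: key both maps by the next value each partial progression is waiting for, so only multiplication is needed):

    def countTripletsForwardNoDiv(arr, r):
        count = 0
        want_second = {}  # v -> singles waiting for v as their second element
        want_third = {}   # v -> pairs waiting for v as their third element
        for x in arr:
            count += want_third.get(x, 0)  # x completes these pairs into triplets
            # every pair (a, x) with a*r == x now waits for x*r as its third element
            want_third[x * r] = want_third.get(x * r, 0) + want_second.get(x, 0)
            # x itself now waits for x*r as its second element
            want_second[x * r] = want_second.get(x * r, 0) + 1
        return count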
It was genius to iterate the inputs in reverse order, so that checking the next element of the sequence can be done by multiplication, which means the algorithm doesn't need to check whether dividing an input by r still yields an integer. I was using one map with int[count, countOfNextSequence] as the value, but it seems using two maps is clearer than my approach. And I can't believe this is a medium level problem XD lol I'm dying.
Can also be done without using reverse order and without division. See other threads.
Thanks for this awesome example! I was testing and you don't need to reverse the array :)
Thank you so much! Couldn't figure out what's the problem, until you mentioned it. I missed "i < j < k" part
No need to reverse the array.
Very beautiful solution, thanks for sharing.
Nice approach. But does it pass the test case a = [1, 1, 1, 1]?
Why wouldn't this work in the other direction, using integer division instead of multiplication?
It would make 1 a special case, but are there any other reasons your solution would be better?
This was really helpful but took me a while to understand what was going on. In the end tabulating the algorithm helped and I hope this saves some people time
Follows the n = [4, 2, 1, 2, 1] and r = 2 example: https://ibb.co/fQ88K1C
Thanks for the explanations!
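For anyone who can't open the image, here is a re-tabulation of that trace against the countTriplets above (my own reconstruction; the linked picture may lay it out differently). Processing n = [4, 2, 1, 2, 1] in reverse with r = 2:

    step  i  i*r  count  dictPairs  dict
    1     1  2    0      {}         {1: 1}
    2     2  4    0      {}         {1: 1, 2: 1}
    3     1  2    0      {1: 1}     {1: 2, 2: 1}
    4     2  4    0      {1: 1}     {1: 2, 2: 2}
    5     4  8    0      {1: 1}     {1: 2, 2: 2, 4: 1}

No element ever finds an existing pair at r times its value, so the answer for this input is 0.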
My Java code based on your solution:
Genius!
Why is this a medium problem lol
Love this simplicity.
I had tried doing a single dictionary and doing a nested check for the i < j < k part, but that blew up for execution speed.
Does this assume the array is sorted, so higher numbers are always further along in the array?
The array is NOT necessarily sorted. That's the whole underlying point of the challenge (and in most test cases, it's not).
So in this solution, isn't the idea of iterating backwards to hit the higher numbers first, so that as you go you can look up through multiplying?
I must be missing something, because I looked at what it would do on paper for an array 27, 9, 3 and it looks like it would never add to the count.
Beautiful!
Somehow I got the idea after reading the 1st paragraph of your comment, though my implementation is a bit different. Thanks :)
I haven't written any Python, and using i as the variable name was tripping me up here for a minute. I was expecting it to be the iterator.
Thank you!!! I learned a lot from your beautiful solution and explanation.