We use cookies to ensure you have the best browsing experience on our website. Please read our cookie policy for more information about how we use cookies.
Not exactly slower, if we look at the constants behind O-s. Say, we sort events, there are 4e5 of them, so we get log(4e5)*4e5/log(2)=7.44e6, the difference became not so drastic. And sort at any step uses conditional operations that, when incorrectely predicted, break the CPU conveyer. That costs about 10 ticks each time. And they would be incorrectely predicted in about 50% cases, that is expensive. The proposed approach is straightforward, the only conditional is "if ( max < x ) max = x" that probably would be saturated fast, that is in most cases it would not be taken, and CPU would be able to predict that.
About numbers: my sorting approach runs in 0.01-0.02s, this approach in my implementation runs in 0.03-0.04s on strong testcases. That is probably because we have to use quite a lot of (random access) memory, 80MB instead of 3.2MB in the sorting approach, and it leads to cache misses and TLB trashing. HugeTLB could help a bit, had it been turned on on the testing server.
Cookie support is required to access HackerRank
Seems like cookies are disabled on this browser, please enable them to open this website
Array Manipulation
You are viewing a single comment's thread. Return to all comments →
Not exactly slower, if we look at the constants behind O-s. Say, we sort events, there are 4e5 of them, so we get log(4e5)*4e5/log(2)=7.44e6, the difference became not so drastic. And sort at any step uses conditional operations that, when incorrectely predicted, break the CPU conveyer. That costs about 10 ticks each time. And they would be incorrectely predicted in about 50% cases, that is expensive. The proposed approach is straightforward, the only conditional is "if ( max < x ) max = x" that probably would be saturated fast, that is in most cases it would not be taken, and CPU would be able to predict that.
About numbers: my sorting approach runs in 0.01-0.02s, this approach in my implementation runs in 0.03-0.04s on strong testcases. That is probably because we have to use quite a lot of (random access) memory, 80MB instead of 3.2MB in the sorting approach, and it leads to cache misses and TLB trashing. HugeTLB could help a bit, had it been turned on on the testing server.