We use cookies to ensure you have the best browsing experience on our website. Please read our cookie policy for more information about how we use cookies.
- Prepare
- Algorithms
- Strings
- Bear and Steady Gene
- Discussions
Bear and Steady Gene
Bear and Steady Gene
Sort by
recency
|
180 Discussions
|
Please Login in order to post a comment
My initial solution of simply searching through from the minimum possible length to the three-quarters of the length of gene (the maximum possible length of a sub-string requiring replace which would occur all the letters where the same; like a string of all 'A's) did not work; it was too slow for the later tests cases. It seems that other people have created sufficiently performant solutions using this approach in C++ but, I couldn't get it to work in Python. So, I binary-searched for the answer instead
`def works (o, L, l, dd): c = { 'A': 0, 'C': 0, 'G': 0, 'T': 0, }
def steadyGene(gene): L = len(gene) d = { 'A': 0, 'C': 0, 'G': 0, 'T': 0 }
If you imagine us initially having every character in the gene, we can 'reverse' the problem to try and remove the fewest characters possible, while still having the other characters be below the required limit of . This transforms the problem into a classic sliding window, where we need to find the smallest window to 'remove', such that the characters outside of the window are still valid.
Initially, we have a window with length , representing that we would need to remove a window of size to make the gene valid. Our goal is to minimise this - i.e., remove the fewest characters possible (the smallest window possible) while still having the gene valid.
This works, because we can always 'add' characters from the window we're removing, but not subtract them. Therefore, the condition for our window being valid is that the characters outside of the window all have a count less than (or equal to) the goal (to be supplemented with the characters inside the window). This can be simplified (in code) to simply track the max count of the characters.
include
using namespace std;
int test(int n, string s) { int required_count = n / 4; map mp; for (char it : s) { mp[it]++; } bool ok = true; for (auto tam : mp) { if (tam.second > required_count) { ok = false; break; } } if (ok) return 0; int min_val = n; int start = 0; for (int end = 0; end < n; end++) { mp[s[end]]--; while (mp['A'] <= required_count && mp['C'] <= required_count && mp['G'] <= required_count && mp['T'] <= required_count) { min_val = min(min_val, end - start + 1); mp[s[start]]++; start++; } } return min_val; }
int main() { int n; cin >> n; string s; cin >> s; cout << test(n, s) << endl; return 0; }
O(n) cpp solution, using two indices to keep track of min substrings, and pushing both indices until we find a new valid substring.
I find the optimal length of the substring to be replaced using binary search and for a given substring length I swipe through all possible substrings. The length of the substring to be replaced must be larger than the number of letters which I need to add to the rest of the string to compensate for the unequal number of letters in the latter. The solution scales as N*log(N)