You are provided with four documents, numbered 1 to 4, each consisting of a single sentence. Determine which of documents 2, 3, and 4 is most similar to the first document, as measured by their TF-IDF scores.
1. I'd like an apple.
2. An apple a day keeps the doctor away.
3. Never compare an apple to an orange.
4. I prefer scikit-learn to orange.
Output that document's number as a single integer (2, 3, or 4), with no leading or trailing spaces.
You may either compute the answer manually and submit it in plain-text mode, or submit a program which computes the answer, in a language of your choice.
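
For the program route, the following is a minimal sketch in Python using only the standard library. The statement does not fix a TF-IDF variant, so everything specific here is an assumption: lowercased tokens that keep internal apostrophes and hyphens (so "I'd" and "scikit-learn" stay whole), raw term counts for TF, a smoothed IDF of ln((1 + N) / (1 + df)) + 1, and cosine similarity as the comparison. The helper names (`tokenize`, `tfidf_vectors`, `cosine`, `most_similar_to_first`) are illustrative only.

```python
import math
import re
from collections import Counter

DOCS = [
    "I'd like an apple.",
    "An apple a day keeps the doctor away.",
    "Never compare an apple to an orange.",
    "I prefer scikit-learn to orange.",
]


def tokenize(text):
    # Assumption: lowercase words; internal apostrophes and hyphens are kept.
    return re.findall(r"[a-z0-9]+(?:['-][a-z0-9]+)*", text.lower())


def tfidf_vectors(docs):
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Assumption: smoothed IDF; other variants (e.g. ln(N / df)) are also common.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    # TF is the raw count of the term in the document.
    return [
        {t: tf * idf[t] for t, tf in Counter(toks).items()}
        for toks in tokenized
    ]


def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar_to_first(docs):
    vecs = tfidf_vectors(docs)
    # Score documents 2..N against document 1, using 1-based identifiers.
    scores = {i + 1: cosine(vecs[0], vecs[i]) for i in range(1, len(vecs))}
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print(most_similar_to_first(DOCS))
```

A different tokenizer or IDF formula changes the individual scores, so it is worth checking that the ranking stays the same under the variant you intend to use.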