    Solved this using a dictionary that I created offline. Each key is a character's segmentation mask (a list of lists, converted to a string representation to make it hashable) and each value is the corresponding character.
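
    A minimal sketch of the key trick, with toy values: Python lists are unhashable, so the nested list is stringified before being used as a dictionary key.

        mask = [[0, 1, 1], [1, 0, 1]]   # toy binary mask, not a real character
        char_map = {str(mask): 'A'}     # str() makes the nested list hashable
        assert char_map[str([[0, 1, 1], [1, 0, 1]])] == 'A'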

    Creating the segmentation (binary) mask for every character involved going through each sample image and doing the following (a code sketch of the full build appears after the list).

    1) The characters in a sample image were extracted one at a time. Because of the problem constraints, the bounding box coordinates of each character were known.

    2) The BGR pixel values of an extracted character image were converted to greyscale, then to binary by applying a suitable threshold to the greyscale value. The binary values were collected into a nested list (rows of column values) representing the binary image of a single character.

    3) The corresponding character was read from the output text file. If not already present, an entry was added to the dictionary with the string form of the binary image as the key and the character as the value.
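
    Here is a rough sketch of the offline build, assuming OpenCV for image loading; the file names, cell size, sample count, and threshold below are all illustrative, not from the original solution.

        import cv2

        # Illustrative constants: character box size, binarization threshold,
        # and number of labelled sample images (all assumed).
        CHAR_W, CHAR_H, THRESHOLD, NUM_SAMPLES = 10, 10, 127, 25

        def binarize(cell):
            """BGR character crop -> greyscale -> nested 0/1 list (rows of columns)."""
            grey = cv2.cvtColor(cell, cv2.COLOR_BGR2GRAY)
            return [[1 if px > THRESHOLD else 0 for px in row] for row in grey]

        char_map = {}
        for i in range(NUM_SAMPLES):
            img = cv2.imread(f'input{i:02d}.png')             # sample image
            text = open(f'output{i:02d}.txt').read().strip()  # its known text
            for j, ch in enumerate(text):
                # Step 1: crop one character; box coordinates follow from the constraints.
                cell = img[0:CHAR_H, j * CHAR_W:(j + 1) * CHAR_W]
                # Steps 2-3: binarize, then record mask -> character if unseen.
                char_map.setdefault(str(binarize(cell)), ch)

        assert len(char_map) == 36   # sanity check: one entry per character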

    By the end of it all, there were 36 entries in the dictionary as expected.

    Then, for a given test image, each character was extracted as in step (1) and its binary image computed as in step (2); the corresponding character was then obtained by looking up the binary image in the dictionary.
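
    Decoding a test image then reduces to the same crop-and-binarize pipeline followed by a dictionary lookup. This sketch reuses binarize(), char_map, and the constants from the build sketch above; the file name and character count are again assumptions.

        def decode(path, num_chars):
            img = cv2.imread(path)
            return ''.join(
                char_map[str(binarize(img[0:CHAR_H, j * CHAR_W:(j + 1) * CHAR_W]))]
                for j in range(num_chars)
            )

        print(decode('test.png', 5))   # prints the decoded string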