Final Report
Check here: https://github.com/kashyap07/currency-detector-opencv
CURRENCY DETECTION FOR THE BLIND
Abstract —
As the name suggests, the problem we are trying to solve is a simple one: detect a paper currency note and output the result as speech. We take in an image, which may come from a webcam, a video stream from a phone, or a photo transferred to the computer; this is the testing data. We extract certain features from it and match them against a set of training data. If there is a match, we report which denomination the note in the input image is. Finally, we output the result as voice.
The technique is as follows —
We used OpenCV 3 with Python 3 for programming. For image acquisition we scanned the currency notes and pre-processed them by cropping; these scans formed our training set. For the testing set we captured photos with a phone[1].
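For illustration, a minimal sketch of how the training scans could be loaded and normalised; the train/ folder name and the 800-pixel working width are assumptions for this sketch, not part of the actual code.

```python
import os
import cv2

TRAIN_DIR = "train"      # assumed folder of cropped scans, one image per denomination
TARGET_WIDTH = 800       # assumed working width so all images are at a comparable scale

def load_training_images(train_dir=TRAIN_DIR):
    """Load each scanned note, convert to grayscale and resize to a common width."""
    images = {}
    for name in os.listdir(train_dir):
        img = cv2.imread(os.path.join(train_dir, name), cv2.IMREAD_GRAYSCALE)
        if img is None:
            continue
        scale = TARGET_WIDTH / float(img.shape[1])
        img = cv2.resize(img, (TARGET_WIDTH, int(img.shape[0] * scale)))
        images[os.path.splitext(name)[0]] = img   # e.g. "500" for a 500-rupee scan
    return images
```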
The next step was to detect the corners of the rectangle (the currency note) and extract it by smoothing the image, applying Canny edge detection, the Sobel operator and contour detection. This is the part we could not get to work reliably, so the final program carries the caveat that the testing image must always be a clear, top-down view of the note.
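Although this step did not make it into the final pipeline, a minimal sketch of the approach we attempted (smoothing, Canny edges, contour detection, then a perspective warp to a top-down view) could look like the following; the output size is an assumed value.

```python
import cv2
import numpy as np

def order_corners(pts):
    """Order 4 corner points as top-left, top-right, bottom-right, bottom-left."""
    pts = pts.reshape(4, 2).astype("float32")
    s = pts.sum(axis=1)
    d = np.diff(pts, axis=1).ravel()
    return np.array([pts[np.argmin(s)], pts[np.argmin(d)],
                     pts[np.argmax(s)], pts[np.argmax(d)]], dtype="float32")

def top_down_view(image, out_w=800, out_h=360):
    """Find the largest 4-sided contour (the note) and warp it to a flat, top-down view."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    edges = cv2.Canny(blurred, 50, 150)
    # findContours returns 3 values in OpenCV 3 and 2 in OpenCV 4; [-2] works for both
    contours = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    for cnt in sorted(contours, key=cv2.contourArea, reverse=True):
        approx = cv2.approxPolyDP(cnt, 0.02 * cv2.arcLength(cnt, True), True)
        if len(approx) == 4:   # assume the note is the largest quadrilateral in the frame
            src = order_corners(approx)
            dst = np.array([[0, 0], [out_w - 1, 0],
                            [out_w - 1, out_h - 1], [0, out_h - 1]], dtype="float32")
            M = cv2.getPerspectiveTransform(src, dst)
            return cv2.warpPerspective(image, M, (out_w, out_h))
    return None   # no note-like contour found
```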
The next part is feature extraction and matching, for which we use OpenCV ORB, an efficient alternative to SIFT. We use ORB to extract keypoints and descriptors from both the training and testing images. To match them we use the OpenCV Brute-Force Matcher (not the most sophisticated choice, but it serves our purpose)[2].
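A minimal sketch of the ORB extraction and brute-force matching step; the nfeatures value is an assumption, and NORM_HAMMING is the distance suited to ORB's binary descriptors.

```python
import cv2

orb = cv2.ORB_create(nfeatures=1000)   # assumed feature count; ORB detector and descriptor
bf = cv2.BFMatcher(cv2.NORM_HAMMING)   # Hamming distance for ORB's binary descriptors

def extract_features(gray_img):
    """Return ORB keypoints and descriptors for a grayscale image."""
    return orb.detectAndCompute(gray_img, None)

def knn_match(test_desc, train_desc, k=2):
    """Brute-force match test descriptors against one training image, k candidates each."""
    return bf.knnMatch(test_desc, train_desc, k=k)
```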
The matches are then passed through a k-nearest-neighbour step, which returns multiple candidate matches per feature. We apply a ratio test, discarding roughly three quarters of the matches with poor ratios and keeping only the ones with good ratios. We then compare the number of good matches against a threshold; if it is exceeded, we take the training image with the largest number of matches and announce it as the detected note.
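A sketch of the ratio test and threshold decision, assuming per-denomination descriptors from the training set; the 0.75 ratio and the match-count threshold of 25 are illustrative values rather than the exact numbers we used.

```python
import cv2

bf = cv2.BFMatcher(cv2.NORM_HAMMING)   # same matcher as in the previous sketch

def good_matches(knn_pairs, ratio=0.75):
    """Keep a match only when it is clearly better than its second-best candidate."""
    good = []
    for pair in knn_pairs:
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    return good

def best_denomination(test_desc, train_descs, threshold=25):
    """Return the label of the training image with the most good matches, if above threshold."""
    best_label, best_count = None, 0
    for label, train_desc in train_descs.items():
        count = len(good_matches(bf.knnMatch(test_desc, train_desc, k=2)))
        if count > best_count:
            best_label, best_count = label, count
    return best_label if best_count >= threshold else None
```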
The final part is the voice output. For this we use gTTS (Google Text-to-Speech), which speaks out whatever string it is given.
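A minimal sketch of the voice output; the output filename and playback via mpg321 are assumptions about the local setup.

```python
import os
from gtts import gTTS

def speak(text, filename="result.mp3"):
    """Convert the result string to speech, save it and play it back."""
    gTTS(text=text, lang="en").save(filename)
    os.system("mpg321 " + filename)   # assumes an mp3 player such as mpg321 is installed

speak("Five hundred rupees")
```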
Role/Contribution —
While it was a combined effort and everyone contributed, we list each member's role below.
01FB14ECS255 - Suhas Kashyap S - Utility functions + ORB approach to image detection and voice output + initial setup and management.
01FB14ECS267 - Tejas S. K. - Use of masks and image segmentation with classifiers, and research on them[3].
01FB14ECS268 - Harshith U. K. - Image collection and preprocessing + some utility functions + edge detection for warping and getting a top-down view of the image[4].
[1] We would further like to improve this by using a webcam or an Android camera video stream as input instead of static images, to make the app real-time.
[2] To improve further, we can create masks that restrict feature extraction to the parts/segments we want to give importance to, such as the area that states the denomination and certain watermarks that could help determine whether the currency is counterfeit. The mask is passed as the second argument to orb.detectAndCompute() (a small sketch follows these notes).
[3], [4] Though these did not work, a lot of effort went into them, and we have included a try.py that contains some of these failed attempts.
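For reference, a small sketch of how such a mask could be passed to ORB; the filename note.jpg and the rectangle coordinates are illustrative placeholders, not measured from an actual note.

```python
import cv2
import numpy as np

img = cv2.imread("note.jpg", cv2.IMREAD_GRAYSCALE)
mask = np.zeros(img.shape, dtype=np.uint8)
# illustrative region of interest, e.g. where the denomination numeral is printed
cv2.rectangle(mask, (600, 20), (780, 120), 255, thickness=-1)

orb = cv2.ORB_create()
keypoints, descriptors = orb.detectAndCompute(img, mask)   # mask is the second argument
```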