I've been working on an image-processing project for logo detection. Specifically, the goal is to develop an automated, real-time FedEx truck/logo detector that reads frames from an IP camera stream and sends a notification on detection. Here's a sample of the system in action, with the recognized logo outlined by the green rectangle.
Some constraints on the project:
- Uses raw OpenCV (no deep learning, AI, or trained neural networks)
- Image background can be noisy
- The brightness of the image can vary greatly (morning, afternoon, night)
- The FedEx truck/logo can have any scale, rotation, or orientation since it could be parked anywhere on the sidewalk
- The logo could potentially be fuzzy or blurry with different shades depending on the time of day
- There may be many other vehicles with similar sizes or colors in the same frame
- Real-time detection (~25 FPS from IP camera)
- The IP camera is in a fixed position and the FedEx truck will always be in the same orientation (never backwards or upside down)
- The FedEx truck will always be the "red" variation instead of the "green" variation
Current Implementation/Algorithm
I have two threads:
- Thread #1 - Captures frames from the IP camera using cv2.VideoCapture() and resizes each frame for further processing. I decided to grab frames in a separate thread to improve FPS by reducing I/O latency, since reading from cv2.VideoCapture() is blocking. Dedicating an independent thread to capturing frames allows the main processing thread to always have a frame available to perform detection on.
- Thread #2 - Main processing/detection thread, which detects the FedEx logo using color thresholding and contour detection.
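A minimal sketch of the capture thread described above. The class name `LatestFrameReader`, its methods, and the optional `resize` callable are hypothetical names for illustration, not my exact code. It accepts any object with a `read()` method that returns `(ok, frame)`, which is the shape of `cv2.VideoCapture.read()`.

```python
import threading
import time


class LatestFrameReader:
    """Hypothetical helper: keeps only the newest frame from a blocking source."""

    def __init__(self, capture, resize=None):
        # `capture` is any object with read() -> (ok, frame),
        # for example cv2.VideoCapture(stream_url).
        self._capture = capture
        self._resize = resize  # optional callable applied to every frame
        self._lock = threading.Lock()
        self._frame = None
        self._running = False
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._running = True
        self._thread.start()
        return self

    def _run(self):
        while self._running:
            ok, frame = self._capture.read()  # blocks until a frame arrives
            if not ok:
                time.sleep(0.01)  # avoid a busy loop if the stream drops
                continue
            if self._resize is not None:
                frame = self._resize(frame)
            with self._lock:
                self._frame = frame  # overwrite: stale frames are discarded

    def latest(self):
        """Return the most recent frame, or None if nothing has arrived yet."""
        with self._lock:
            return self._frame

    def stop(self):
        self._running = False
        if self._thread.is_alive():
            self._thread.join(timeout=1.0)


# Usage with OpenCV (stream_url and the target size are placeholders):
#   import cv2
#   reader = LatestFrameReader(
#       cv2.VideoCapture(stream_url),
#       resize=lambda f: cv2.resize(f, (640, 360)),
#   ).start()
#   frame = reader.latest()
```

Because only the newest frame is kept, the processing thread never works through a backlog of stale frames; the trade-off is that frames arriving faster than they can be processed are silently dropped.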
Overall Pseudo-algorithm
For each frame:
    Find bounding box for purple color of logo
    Find bounding box for red/orange color of logo
    If both bounding boxes are valid/adjacent and contours pass checks:
        Combine bounding boxes
        Draw combined bounding boxes on original frame
        Play sound notification for detected logo
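The per-frame decision in the pseudo-algorithm can be sketched roughly as follows. This is an illustration under assumptions, not my exact code: boxes are `(x, y, width, height)` tuples as returned by `cv2.boundingRect()`, the purple part is assumed to sit to the left of the red part, and the names `boxes_adjacent`, `combine_boxes`, `detect_logo` and the `max_gap` tolerance of 20 pixels are all hypothetical. Drawing the rectangle and playing the sound are left to the caller.

```python
from typing import Callable, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def boxes_adjacent(left: Box, right: Box, max_gap: int = 20) -> bool:
    """True if `right` starts near the right edge of `left` and they share rows."""
    lx, ly, lw, lh = left
    rx, ry, rw, rh = right
    horizontal_gap = rx - (lx + lw)
    vertical_overlap = min(ly + lh, ry + rh) - max(ly, ry)
    return -max_gap <= horizontal_gap <= max_gap and vertical_overlap > 0


def combine_boxes(a: Box, b: Box) -> Box:
    """Smallest box that contains both input boxes."""
    x1 = min(a[0], b[0])
    y1 = min(a[1], b[1])
    x2 = max(a[0] + a[2], b[0] + b[2])
    y2 = max(a[1] + a[3], b[1] + b[3])
    return (x1, y1, x2 - x1, y2 - y1)


def detect_logo(
    frame,
    find_purple: Callable[[object], Optional[Box]],
    find_red: Callable[[object], Optional[Box]],
    max_gap: int = 20,
) -> Optional[Box]:
    """Return the combined logo box for one frame, or None if the checks fail."""
    purple = find_purple(frame)
    red = find_red(frame)
    if purple is None or red is None:
        return None  # one of the colors was not found
    if not boxes_adjacent(purple, red, max_gap):
        return None  # both colors found, but not side by side
    return combine_boxes(purple, red)
```

Passing the two color detectors in as callables keeps this logic independent of how each bounding box is found.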
Color thresholding for logo detection
For color thresholding, I have defined HSV (low, high) thresholds for purple and red to detect the logo.
colors = {
    'purple': ([120, 45, 45], [150, 255, 255]),
    'red': ([0, 130, 0], [15, 255, 255])
}
To find the bounding box coordinates for each color, I follow this algorithm:
- Blur the frame
- Erode and dilate the frame with a kernel to remove background noise
- Convert frame from BGR to HSV color format
- Mask the frame using the lower and upper HSV bounds defined for each color
- Find the largest contour in the mask and obtain its bounding coordinates
After masking, I obtain these isolated purple (left) and red (right) sections of the logo.
False positive checks
Now that I have the two masks, I perform checks to ensure that the found bounding boxes actually form a logo. To do this, I use cv2.matchShapes(), which compares two contours and returns a metric showing their similarity; the lower the result, the closer the match. In addition, I use cv2.pointPolygonTest(), which finds the shortest distance between a point in the image and a contour, for additional verification. My false positive process involves:
- Checking if the bounding boxes are valid
- Ensuring the two bounding boxes are adjacent based on their relative proximity
If the bounding boxes pass the adjacency and similarity metric test, the bounding boxes are combined and a FedEx notification is triggered.
Results
This check algorithm is not really robust as there are many false positives and failed detections. For instance, these false positives were triggered.
While this color thresholding and contour detection approach worked in basic cases where the logo was clear, it was severely lacking in some areas:
- There are latency problems from having to compute bounding boxes on each frame
- It occasionally reports a detection when the logo is not present
- Brightness and time of day had a great impact on detection accuracy
- When the logo was at a skewed angle, color thresholding still found it, but the check algorithm then rejected the detection.
Would anyone be able to help me improve my algorithm or suggest alternative detection strategies? Is there any other way to perform this detection since color thresholding is highly dependent on exact calibration? If possible, I would like to move away from color thresholding and the multiple layers of filters since it's not very robust. Any insight or advice is greatly appreciated!