4
votes

I am working on a machine learning project using YOLO. I am creating my own dataset following the guide found here (in the section "How to train (to detect your custom objects)"). For the bounding boxes I need to know the [x] [y] [width] [height] of each object I want to train YOLO on in a given picture. So far I have been finding these by hand, but it is becoming very time-consuming. I was hoping to get some help writing a script that could calculate this for me. I know OpenCV has some great tools for image manipulation, but I don't know where to begin for finding the object coordinates.


3 Answers

1
votes

The page you mention has a section that links to a tool for drawing these boxes:

How to mark bounded boxes of objects and create annotation files:

Here you can find repository with GUI-software for marking bounded boxes of objects and generating annotation files for Yolo v2: https://github.com/AlexeyAB/Yolo_mark

0
votes

I faced the same problem, but in my case the data was video and the background stayed the same, so I used background subtraction. You can try the code below, which thresholds a single image and draws a box around each contour it finds. By adjusting the threshold you may be able to get what you want.

import cv2
import numpy as np 

# read and scale down image
# wget https://bigsnarf.files.wordpress.com/2017/05/hammer.png
img = cv2.pyrDown(cv2.imread('hammer.png', cv2.IMREAD_UNCHANGED))

# threshold image
ret, threshed_img = cv2.threshold(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY),
                127, 255, cv2.THRESH_BINARY)
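# find contours (RETR_TREE retrieves all of them, including nested ones)
# [-2:] keeps this working on OpenCV 3 (three return values) and OpenCV 4 (two)
contours, hier = cv2.findContours(threshed_img, cv2.RETR_TREE,
                cv2.CHAIN_APPROX_SIMPLE)[-2:]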

# with each contour, draw boundingRect in green
# a minAreaRect in red and
# a minEnclosingCircle in blue
for c in contours:
    # get the bounding rect
    x, y, w, h = cv2.boundingRect(c)
    # draw a green rectangle to visualize the bounding rect
    cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 2)

    # get the min area rect
    rect = cv2.minAreaRect(c)
    box = cv2.boxPoints(rect)
    # convert the floating-point corner coordinates to integers
    # (np.int0 was removed in NumPy 2.0)
    box = box.astype(np.int32)
    # draw a red rotated rectangle
    cv2.drawContours(img, [box], 0, (0, 0, 255))

    # finally, get the min enclosing circle
    (x, y), radius = cv2.minEnclosingCircle(c)
    # convert all values to int
    center = (int(x), int(y))
    radius = int(radius)
    # and draw the circle in blue
    img = cv2.circle(img, center, radius, (255, 0, 0), 2)

print(len(contours))
cv2.drawContours(img, contours, -1, (255, 255, 0), 1)

cv2.imshow("contours", img)

ESC = 27
while True:
    keycode = cv2.waitKey()
    if keycode != -1:
        keycode &= 0xFF
        if keycode == ESC:
            break
cv2.destroyAllWindows()
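
The rectangle returned by cv2.boundingRect is in pixels, with (x, y) as the top-left corner, while a Darknet-style YOLO annotation line holds the class index followed by the box center and size as fractions of the image width and height. Below is a minimal sketch of that conversion. The function name `to_yolo_line`, the `class_id` argument, and the sample numbers are made up for illustration. If you use the script above, the image size should be that of the scaled-down image (`img.shape[:2]` gives height, width), since the contours were found on it.

```python
def to_yolo_line(class_id, x, y, w, h, img_w, img_h):
    """Convert a pixel box (top-left x, y, width, height) to a YOLO line."""
    # YOLO stores the box center, not the top-left corner
    x_center = (x + w / 2) / img_w
    y_center = (y + h / 2) / img_h
    # width and height are normalized by the image size as well
    return (f"{class_id} {x_center:.6f} {y_center:.6f} "
            f"{w / img_w:.6f} {h / img_h:.6f}")


# Hypothetical numbers: a 100x50 box at (200, 100) in a 640x480 image
line = to_yolo_line(0, 200, 100, 100, 50, 640, 480)
print(line)
```

Inside the contour loop you would call it with the values from cv2.boundingRect and write one line per object to the .txt file that sits next to the image.
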
0
votes

Here is part of the source code of Yolo-mark-pwa. As you can see, it is much more readable than the original Yolo_mark (click the GitHub icon in the right corner, then check src/utils/createExportCord.ts and src/utils/readExportCord.ts). naturalWidth and naturalHeight are the image size; height and width are the size of the blue rectangle.

namespace mark {

  export namespace utils {

    export const createExportCord = ({
      name, height, width, top, left, naturalHeight, naturalWidth
    }) => {
      console.log({name, height, width, top, left, naturalHeight, naturalWidth});

      const x = (left + (width/2)) / naturalWidth;
      const y = (top + (height/2)) / naturalHeight;
      const w = width / naturalWidth;
      const h = height / naturalHeight;

      return [name, x, y, w, h].join(' ');
    }

  } // namespace utils

} // namespace mark
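
For checking annotations it can help to go the other way, from a saved line back to a pixel rectangle. The sketch below is simply the algebraic inverse of the formulas above, written in Python. It is not the code from readExportCord.ts, and the name `from_yolo_line` and the sample line are assumptions for illustration.

```python
def from_yolo_line(line, natural_width, natural_height):
    """Turn 'name x y w h' (normalized) back into a pixel rectangle."""
    name, x, y, w, h = line.split()
    # undo the normalization of the box size
    width = float(w) * natural_width
    height = float(h) * natural_height
    # x and y are the box center, so step back half the size
    left = float(x) * natural_width - width / 2
    top = float(y) * natural_height - height / 2
    return name, left, top, width, height


# Hypothetical line for a 640x480 image
box = from_yolo_line("0 0.5 0.5 0.25 0.5", 640, 480)
print(box)  # ('0', 240.0, 120.0, 160.0, 240.0)
```
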
