3 votes

I'm using TensorFlow for object detection on a webcam feed, and I also need the coordinates of each detected object in the image. When I print the bounding boxes ("boxes"), I see an array of arrays; if I pull the first array inside the boxes array, it gives [ymin, xmin, ymax, xmax], which I assume are the coordinates of the first object.

My question: if three objects are identified (person, chair, backpack), how do I get the coordinates of each of them from the boxes array?

    # Each box represents a part of the image where a particular object was detected.
    # The companion tensors referenced in sess.run below use the standard names
    # from the exported detection graph.
    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
    detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
    detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
    detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
    num_detections = detection_graph.get_tensor_by_name('num_detections:0')

    # Actual detection.
    (boxes, scores, classes, num) = sess.run(
        [detection_boxes, detection_scores, detection_classes, num_detections],
        feed_dict={image_tensor: image_np_expanded})
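
For context, the arrays come back batched: boxes[0] holds the detections for the single image in the batch, so boxes[0][i] is the i-th detection and scores[0][i] / classes[0][i] are its score and class id. The number of rows (100 in the dump below) comes from the model's maximum-detections setting; a quick way to check the shapes:

    # boxes   -> (1, N, 4): normalized [ymin, xmin, ymax, xmax] per detection
    # scores  -> (1, N):    confidence per detection
    # classes -> (1, N):    label-map id per detection
    print(boxes.shape, scores.shape, classes.shape)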

Coordinates of the first object:

      ymin = int(boxes[0][0][0] * height)
      xmin = int(boxes[0][0][1] * width)
      ymax = int(boxes[0][0][2] * height)
      xmax = int(boxes[0][0][3] * width)
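
To generalize this to every detected object, loop over the detections: scores[0] and classes[0] line up index-for-index with boxes[0], so the class id at each index tells you which object the box belongs to. A minimal sketch, assuming category_index was built with label_map_util as in the Object Detection API tutorial and that width/height hold the frame size (the 0.5 threshold is my own choice):

      min_score = 0.5  # assumption: keep only reasonably confident detections
      for box, score, cls in zip(boxes[0], scores[0], classes[0]):
          if score < min_score:
              continue
          ymin, xmin, ymax, xmax = box
          # category_index maps the class id to a name, e.g. 'person', 'chair', 'backpack'
          label = category_index[int(cls)]['name']
          print(label,
                int(xmin * width), int(ymin * height),
                int(xmax * width), int(ymax * height))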

Boxes Array

[[[3.98103416e-01 2.58287102e-01 9.33272898e-01 7.91557074e-01]
  [5.12995601e-01 2.65186965e-01 8.48852396e-01 4.13343787e-01]
  [3.66972327e-01 7.32781529e-01 7.90232778e-01 9.99156952e-01]
  [3.98981631e-01 7.82709479e-01 7.35594213e-01 9.70018148e-01]
  [6.31434858e-01 6.56596124e-02 9.94844258e-01 9.91125226e-01]
  [4.03546631e-01 3.72542024e-01 8.93805206e-01 7.11475253e-01]
  [4.92211223e-01 2.52640486e-01 7.18696833e-01 4.07687068e-01]
  [6.85503840e-01 4.34535980e-01 8.68745089e-01 4.70350862e-01]
  [3.38426471e-01 5.05429626e-01 3.66816342e-01 5.16116381e-01]
  [5.56789041e-01 0.00000000e+00 8.72881413e-01 6.77806586e-02]
  [7.46141136e-01 4.60771352e-01 8.83682549e-01 5.43849111e-01]
  [3.97113353e-01 4.10500914e-01 4.45086926e-01 4.24319774e-01]
  [3.31506431e-01 4.76335794e-01 3.56430948e-01 4.88202840e-01]
  [5.03257632e-01 2.41804078e-01 6.15983486e-01 3.58820289e-01]
  [4.19946134e-01 1.86517924e-01 9.83408988e-01 9.87705827e-01]
  [5.36903977e-01 2.56856382e-01 6.54782653e-01 3.43218327e-01]
  [4.12004054e-01 5.86636543e-01 6.93381369e-01 1.00000000e+00]
  [5.20348430e-01 2.73481667e-01 6.10880256e-01 3.93006027e-01]
  [5.74276567e-01 2.87748992e-01 7.21984267e-01 3.91383827e-01]
  [2.89862007e-01 2.84560740e-01 7.39836216e-01 6.37443185e-01]
  [4.96246338e-01 2.63413846e-01 7.70015240e-01 4.57567930e-01]
  [5.19865811e-01 1.98258713e-01 6.03035390e-01 2.73524016e-01]
  [4.83228505e-01 2.16619462e-01 6.39268816e-01 4.49601918e-01]
  [3.84083927e-01 4.67547536e-01 4.31649745e-01 4.81992364e-01]
  [3.48392308e-01 3.74274254e-01 5.03999054e-01 6.12540483e-01]
  [5.74921966e-01 3.47678363e-03 9.15780425e-01 2.38593623e-01]
  [3.45435590e-01 5.46659231e-01 3.74062091e-01 5.60205579e-01]
  [6.14662647e-01 1.25400722e-02 9.80371952e-01 4.75324601e-01]
  [6.86917663e-01 4.48385119e-01 8.99734139e-01 6.21741176e-01]
  [5.43244362e-01 2.79347658e-01 6.78029895e-01 3.91894460e-01]
  [3.64326298e-01 0.00000000e+00 6.73295557e-01 7.37453580e-01]
  [6.07542276e-01 5.06267175e-02 7.68457413e-01 2.98367083e-01]
  [2.58686990e-01 4.09406722e-02 7.81508684e-01 6.48879886e-01]
  [4.82539684e-01 2.30819955e-01 5.28611958e-01 3.63296390e-01]
  [4.15112287e-01 7.37580359e-01 7.52923012e-01 9.21294510e-01]
  [3.98340166e-01 4.32618022e-01 7.49099314e-01 6.08014822e-01]
  [5.01145720e-01 3.60702038e-01 6.23760223e-01 4.31556940e-01]
  [4.80772495e-01 2.87768871e-01 5.28597713e-01 3.74635667e-01]
  [7.58994102e-01 5.31462878e-02 8.83683920e-01 3.54582548e-01]
  [7.33969390e-01 4.40615416e-01 8.75839055e-01 4.80525196e-01]
  [3.32216144e-01 4.08021301e-01 5.61710179e-01 5.81104398e-01]
  [5.27267098e-01 2.43273854e-01 8.85313392e-01 4.95633841e-01]
  [6.36232436e-01 1.87747717e-01 8.11940730e-01 2.77306348e-01]
  [5.35063982e-01 3.03871930e-04 9.76815224e-01 1.71860784e-01]
  [5.52567542e-01 3.21750902e-03 6.88723385e-01 6.52680770e-02]
  [5.18970549e-01 1.09627441e-01 8.73319566e-01 4.07769084e-01]
  [4.70146358e-01 3.06456447e-01 6.97903335e-01 4.66317654e-01]
  [4.12567884e-01 2.80987918e-01 8.80960703e-01 6.27162635e-01]
  [5.88587642e-01 2.44935393e-01 8.71819854e-01 3.90277863e-01]
  [5.04896283e-01 5.80270052e-01 7.68337965e-01 1.00000000e+00]
  [4.37421232e-01 3.08278799e-01 7.68657207e-01 7.64984012e-01]
  [5.28930783e-01 3.55813205e-01 8.43232036e-01 5.18654346e-01]
  [7.06788957e-01 4.37110692e-01 9.14967120e-01 5.72017550e-01]
  [2.18201816e-01 4.88946289e-01 8.62469971e-01 9.63312864e-01]
  [6.22738302e-01 1.38248444e-01 8.57222259e-01 2.82573849e-01]
  [6.54138923e-01 3.04315478e-01 7.45888233e-01 3.82857710e-01]
  [4.29467261e-01 7.80698359e-01 6.48266017e-01 1.00000000e+00]
  [4.46878254e-01 2.09028199e-01 6.40358984e-01 3.75346243e-01]
  [3.16258848e-01 8.32504749e-01 8.23982418e-01 9.90386963e-01]
  [6.05024457e-01 2.52521902e-01 8.12417507e-01 4.83819872e-01]
  [3.51336688e-01 4.98775810e-01 1.00000000e+00 9.06152248e-01]
  [6.95564687e-01 7.50948310e-01 8.88447940e-01 1.00000000e+00]
  [3.88258576e-01 5.33289671e-01 8.75454664e-01 9.59217429e-01]
  [3.02734107e-01 5.05446017e-01 3.32951814e-01 5.21313846e-01]
  [5.78578115e-01 2.70111322e-01 7.26399779e-01 3.33653688e-01]
  [8.31338286e-01 2.76189297e-02 9.03123498e-01 3.93661559e-01]
  [5.81698239e-01 1.67757824e-01 8.35726202e-01 2.95289814e-01]
  [5.36950946e-01 3.19106877e-02 9.08833861e-01 7.96007037e-01]
  [3.30216944e-01 1.85503408e-01 6.03329062e-01 6.20327592e-01]
  [6.18818402e-01 0.00000000e+00 7.89601564e-01 1.48627713e-01]
  [6.30266726e-01 4.33205903e-01 8.52360308e-01 5.72274148e-01]
  [3.35949063e-01 4.14158136e-01 3.57951701e-01 4.26587313e-01]
  [7.20640838e-01 8.04916024e-03 9.58042920e-01 5.61199069e-01]
  [5.84677398e-01 7.03474283e-02 7.36290991e-01 3.44596863e-01]
  [4.19883460e-01 4.06788111e-01 4.68255132e-01 4.25109506e-01]
  [6.39927804e-01 2.00164318e-03 9.97901261e-01 8.11412573e-01]
  [5.33949196e-01 2.72342533e-01 8.78523052e-01 4.32817489e-01]
  [6.58142984e-01 4.44802016e-01 7.81714380e-01 5.50549507e-01]
  [3.07564199e-01 3.41288984e-01 6.50976717e-01 6.01556361e-01]
  [4.40935194e-01 1.88440084e-02 9.83413517e-01 9.33977723e-01]
  [4.89288419e-01 0.00000000e+00 1.00000000e+00 1.12769425e-01]
  [7.88989484e-01 2.11772084e-01 1.00000000e+00 3.67680490e-01]
  [5.96338212e-01 3.88508767e-01 6.47209346e-01 4.74365383e-01]
  [3.38769794e-01 3.37847918e-01 8.24353933e-01 5.76343238e-01]
  [4.90569234e-01 1.95525885e-01 7.80579686e-01 3.27015400e-01]
  [5.47312975e-01 3.57423306e-01 6.47958755e-01 4.23456550e-01]
  [6.18103385e-01 6.21652603e-02 8.44633698e-01 2.22137511e-01]
  [5.27049541e-01 1.17950372e-01 6.17353082e-01 1.53708220e-01]
  [7.16908813e-01 4.01781499e-03 9.84125972e-01 3.00909281e-01]
  [5.44539571e-01 2.41813436e-03 8.41970801e-01 1.26415297e-01]
  [4.44469690e-01 8.28753531e-01 7.46623278e-01 9.87598717e-01]
  [3.15123707e-01 4.62584257e-01 3.35273296e-01 4.77230787e-01]
  [3.47529173e-01 3.27103466e-01 7.68559933e-01 9.83226418e-01]
  [4.02757078e-01 3.62969428e-01 4.32684571e-01 3.75950783e-01]
  [7.05794036e-01 1.78262621e-01 9.70153272e-01 2.98915476e-01]
  [5.08592129e-01 7.52241552e-01 7.34334707e-01 9.93464291e-01]
  [1.42102808e-01 0.00000000e+00 4.66604441e-01 7.86211610e-01]
  [5.85047305e-01 1.92644715e-01 6.23101652e-01 2.79952288e-01]
  [6.69153452e-01 7.77340114e-01 7.54663706e-01 8.55158150e-01]
  [3.93155396e-01 9.21279490e-01 7.05600083e-01 9.97392118e-01]]]
Counter({'person': 1, 'chair': 1, 'backpack': 1}) 2018-09-20 17:07:59.300454
Did you find a solution for this? I have a similar requirement; if you did, please share it. – K K
@KK No, I didn't, but I found a workaround for my problem. – min2bro
Can you please share it? – K K

1 Answer

2 votes

You can try this:

from PIL import Image
import numpy as np

image = Image.open(image_path)
# The array-based representation of the image will be used later in order to
# prepare the result image with boxes and labels on it.
image_np = load_image_into_numpy_array(image)
# Expand dimensions since the model expects images to have shape: [1, None, None, 3]
image_np_expanded = np.expand_dims(image_np, axis=0)
# Actual detection.
output_dict = run_inference_for_single_image(image_np, detection_graph)

width = 1024     # frame width in pixels (should match the image you are drawing on)
height = 600     # frame height in pixels
threshold = 0.5  # minimum detection score to keep

# i is a normalized [ymin, xmin, ymax, xmax] box and j is its confidence score
for i, j in zip(output_dict['detection_boxes'], output_dict['detection_scores']):
    if j > threshold:
        print(i[1]*width, i[0]*height, i[3]*width, i[2]*height)

This will print each box in the order xmin, ymin, xmax, ymax (scaled to pixels). I used 0.5 as the minimum score threshold. Hope this helps; let me know if you run into any issues.
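
If you also need to know which object each box belongs to, read output_dict['detection_classes'] at the same index. A hedged sketch, assuming category_index was built from your label map with label_map_util, as in the Object Detection API tutorial notebook:

for box, score, cls in zip(output_dict['detection_boxes'],
                           output_dict['detection_scores'],
                           output_dict['detection_classes']):
    if score > threshold:
        ymin, xmin, ymax, xmax = box
        # map the numeric class id to its name, e.g. 'person'
        label = category_index[int(cls)]['name']
        print(label, xmin * width, ymin * height, xmax * width, ymax * height)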