This is my first blog post in six months; honestly, I was extremely busy with graduate school applications. I plan to get back to writing blog posts more frequently.
Today, I will talk about how to detect a hula-hoop with a camera and estimate its position relative to the camera. The method is currently being developed as part of the Cyclone project, in which I work on an autonomous racing drone using C++ and OpenCV. Detecting a hula-hoop is useful, for instance, when developing a drone that has to fly through the hoop or avoid it during a race. The method presented here is not optimal, but it is good practice for beginners in computer vision. It assumes that the camera is properly calibrated, that the hoop is fitted with red LEDs, and that the camera is facing the center of the hoop.
First, the image (a frame from the live video) is converted to the HSV (Hue, Saturation, Value) color space. HSV is more useful here than the BGR image that comes straight out of the camera because it separates intensity from color, which makes thresholding much easier and reduces the effect of lighting changes while tracking the target. The LEDs that need to be detected have a consistently high luminosity, so a relatively narrow threshold at a high intensity level already discards most of the unwanted pixels. Empirically and through research, the following ranges worked well for the red LEDs in HSV:
H = [45,255]
S = [170,245]
V = [170,255]
This thresholding keeps only the red points in the frame, with the following results:
As you can see, the red LEDs are extracted (we discard the green color in our example).
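To make the thresholding step concrete, here is a minimal sketch of the per-pixel test in plain C++, without OpenCV. The types `HsvPixel` and `HsvRange` and the functions `redLedRange`, `withinRange` and `thresholdMask` are hypothetical names chosen for this illustration; in an OpenCV pipeline the same inclusive range test is usually done on a whole image with `cv::inRange` after a `cv::cvtColor` conversion.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One 8-bit HSV pixel (hypothetical helper type for this sketch).
struct HsvPixel {
    std::uint8_t h;
    std::uint8_t s;
    std::uint8_t v;
};

// Inclusive lower and upper bounds for each channel.
struct HsvRange {
    HsvPixel lower;
    HsvPixel upper;
};

// The ranges quoted in the post: H = [45,255], S = [170,245], V = [170,255].
inline HsvRange redLedRange() {
    return HsvRange{HsvPixel{45, 170, 170}, HsvPixel{255, 245, 255}};
}

// True when every channel lies inside its inclusive range.
inline bool withinRange(const HsvPixel& p, const HsvRange& r) {
    return p.h >= r.lower.h && p.h <= r.upper.h &&
           p.s >= r.lower.s && p.s <= r.upper.s &&
           p.v >= r.lower.v && p.v <= r.upper.v;
}

// Builds a binary mask with one entry per pixel: 255 = keep, 0 = discard.
inline std::vector<std::uint8_t> thresholdMask(const std::vector<HsvPixel>& pixels,
                                               const HsvRange& r) {
    assert(r.lower.h <= r.upper.h && r.lower.s <= r.upper.s && r.lower.v <= r.upper.v);
    std::vector<std::uint8_t> mask;
    mask.reserve(pixels.size());
    for (const HsvPixel& p : pixels) {
        mask.push_back(static_cast<std::uint8_t>(withinRange(p, r) ? 255 : 0));
    }
    return mask;
}
```

Calling `thresholdMask(pixels, redLedRange())` on the HSV pixels of a frame gives the binary image that the next steps work on.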
In order to apply ellipse fitting, the edges of the thresholded image need to be extracted, which is done with Canny edge detection. OpenCV provides this as a built-in function.
Finally, an ellipse is fitted to all the coordinate points remaining in the array, which describes the position of the hula-hoop in the image. This is done with the function fitEllipse from OpenCV. This function returns the width and height of the detected ellipse as well as the tilt angle ω of the major axis. From these, we can compute k, which is the length of the minor axis (b) divided by the length of the major axis (a). In theory, these are the values extracted from the function:
The following figure shows the result of the detection on our hoop:
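As a small illustration of how k and the semi-axes can be derived from the fit, here is a sketch in plain C++. `FittedEllipse` is a hypothetical stand-in for the fields read from the `cv::RotatedRect` that `cv::fitEllipse` returns (its `size.width` and `size.height` are full axis lengths in pixels, and its `angle` is in degrees); `EllipseShape` and `describeEllipse` are names invented for this example.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical copy of the values read from the fitted ellipse:
// full axis lengths in pixels and the rotation angle in degrees.
struct FittedEllipse {
    double width;
    double height;
    double angleDeg;
};

// Quantities used later in the post.
struct EllipseShape {
    double semiMajorPx;  // half of the longer axis
    double semiMinorPx;  // half of the shorter axis
    double k;            // minor axis / major axis, in (0, 1]
};

// Sorts the two axes so that k never exceeds 1, whichever one the fit
// reported as "width".
inline EllipseShape describeEllipse(const FittedEllipse& e) {
    assert(e.width > 0.0 && e.height > 0.0);
    const double major = std::max(e.width, e.height);
    const double minor = std::min(e.width, e.height);
    return EllipseShape{major / 2.0, minor / 2.0, minor / major};
}
```

Taking the larger of the two lengths as the major axis avoids relying on which side the fit labels as width or height.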
Great, now we have detected the ellipse and have some important values such as the angle ω and k. Getting the position in Cartesian coordinates is a little trickier. Here, we assume that the origin is the center of the hoop, x is the horizontal axis (left and right from the center), y is the vertical axis (up and down from the center) and z is the axis from the camera to the center of the hoop.
We wish to determine the relative position (x, y and z) between the camera and the hoop, expressed in a frame whose origin is the center of the hoop.
In order to do so, we need to determine the distance from the camera to the hoop as well as the horizontal and vertical angles of the hoop. Specifically, we need the following values:
α2: Angle of rotation about the camera frame's vertical axis between the disc axis and the horizontal direction of view of the camera (0 = no rotation; 90 degrees = viewed left edge-on, disc facing right). This is the horizontal angle.
β2: Angle of rotation about the camera frame's horizontal axis ('lie back') between the disc axis and the real world horizontal plane (0 = no rotation, disc is vertical; 90 degrees = disc is flat, facing up). This is the vertical angle.
d: Distance from the camera to the center of the hoop.
Determining the distance
First, the hoop has to be measured and the focal length of the camera lens has to be determined. The focal length can be computed from a reference shot with the following equation:
F = PD/R
where F is the focal length (in pixels), R is the radius of the hoop in meters, D is the distance in meters at which the reference shot is taken and P is the radius of the hoop in pixels as it appears in the captured frame.
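The calibration formula is a one-liner in code. The sketch below uses a hypothetical function name, `focalLengthPx`; any numbers passed to it in the usage note are made up for illustration and are not measurements from the project.

```cpp
#include <cassert>

// F = P * D / R
// radiusPx           : P, radius of the hoop in the reference frame, in pixels
// referenceDistanceM : D, distance of the reference shot, in meters
// hoopRadiusM        : R, physical radius of the hoop, in meters
// The result F is expressed in pixels.
inline double focalLengthPx(double radiusPx, double referenceDistanceM, double hoopRadiusM) {
    assert(radiusPx > 0.0 && referenceDistanceM > 0.0 && hoopRadiusM > 0.0);
    return radiusPx * referenceDistanceM / hoopRadiusM;
}
```

For instance, a hoop of radius 0.5 m that appears with a radius of 300 pixels in a reference shot taken at 1 m would give `focalLengthPx(300.0, 1.0, 0.5)`, that is 600 pixels.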
In order to determine the distance to the object, the semi-major axis of the ellipse in the captured frame is compared to the circle in the reference frame, which represents the hoop seen perpendicularly from a distance of 1 meter. Using the focal length F computed above, the distance is given by the following equation:
d = FR/p_major
where p_major is the semi-major axis of the perceived ellipse, in pixels.
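The distance estimate can be sketched the same way. `distanceM` is a hypothetical name, and the focal length is assumed to be in pixels, as produced by the calibration equation F = PD/R.

```cpp
#include <cassert>

// d = F * R / p_major
// focalPx     : F, focal length in pixels
// hoopRadiusM : R, physical radius of the hoop, in meters
// semiMajorPx : p_major, semi-major axis of the detected ellipse, in pixels
// The result d is expressed in meters.
inline double distanceM(double focalPx, double hoopRadiusM, double semiMajorPx) {
    assert(focalPx > 0.0 && hoopRadiusM > 0.0 && semiMajorPx > 0.0);
    return focalPx * hoopRadiusM / semiMajorPx;
}
```

With a focal length of 600 pixels and a hoop radius of 0.5 m (illustrative values), a semi-major axis of 150 pixels gives a distance of 2 m. The semi-major axis is the natural choice here because tilting the hoop mainly shortens the minor axis, while the major axis stays close to the apparent diameter of the hoop.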
Determining the angles
The following source suggests that, using k and ω, we can compute the following horizontal and vertical angles:
The math behind this is a bit complicated so I will avoid going into the details.
Determining the position
Using the computed values, we can now compute the values of x, y and z:
x = sin(β2) * cos(α2) * d
y = sin(α2) * d
z = cos(α2) * d
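The three formulas above translate directly into code. The sketch below is a literal transcription of them; the names `Position`, `degToRad` and `hoopPosition` are hypothetical, and the angles are assumed to be given in radians, since that is what `std::sin` and `std::cos` expect.

```cpp
#include <cassert>
#include <cmath>

// Position in the hoop-centered frame described in the post.
struct Position {
    double x;
    double y;
    double z;
};

// Converts degrees to radians.
inline double degToRad(double deg) {
    return deg * 3.14159265358979323846 / 180.0;
}

// alpha2 : horizontal angle, in radians
// beta2  : vertical angle, in radians
// d      : distance from the camera to the center of the hoop
inline Position hoopPosition(double alpha2, double beta2, double d) {
    assert(d >= 0.0);
    Position p;
    p.x = std::sin(beta2) * std::cos(alpha2) * d;  // x = sin(β2) * cos(α2) * d
    p.y = std::sin(alpha2) * d;                    // y = sin(α2) * d
    p.z = std::cos(alpha2) * d;                    // z = cos(α2) * d
    return p;
}
```

As a quick sanity check, `hoopPosition(0.0, 0.0, 2.0)` returns x = 0, y = 0 and z = 2, which matches a camera looking straight at the center of the hoop from 2 m away.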
And that’s it! If the assumptions stated at the beginning of this post hold, then the output results should be correct.
A small trick: This method sometimes proved to be inaccurate when looking at the hoop from the center (where x and y should be nearly 0). This happens because the tilt angle becomes unreliable: when the ellipse is almost a circle, the detection cannot tell which axis is the major one and which is the minor one. Hence, whenever k (remember, k is the minor axis divided by the major axis) is nearly 1, we can assume that the ellipse is a circle, and thus that we are looking at it from the center. As a result, we can then assume that x and y have a value of 0. This is not the best way to fix this, but it works in most cases.
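This trick can be sketched as a small post-processing step. The function name `snapToCenterIfCircular` and the default threshold of 0.95 are assumptions made for this example; the post does not give a specific cutoff for "nearly 1", so the value would need tuning on real footage. The `Position` struct is repeated here so that the snippet stands alone.

```cpp
#include <cassert>

// Position in the hoop-centered frame described in the post.
struct Position {
    double x;
    double y;
    double z;
};

// If the detected ellipse is almost a circle (k close to 1), assume the
// camera is looking at the hoop from the center and force x and y to 0.
// z is left untouched. circleThreshold is a hypothetical tuning value.
inline Position snapToCenterIfCircular(Position p, double k, double circleThreshold = 0.95) {
    assert(k > 0.0 && k <= 1.0);
    if (k >= circleThreshold) {
        p.x = 0.0;
        p.y = 0.0;
    }
    return p;
}
```

For example, a noisy estimate of (0.3, -0.2, 2.0) with k = 0.98 becomes (0, 0, 2.0), while the same estimate with k = 0.6 is returned unchanged.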
Ahmed Ahres, 21.