Optical 3D position tracking with OpenCV and ArUco markers.


Continuing from experiments with static magnetic fields and Hall effect sensors for positioning, this is an attempt at using optical tracking for the same purpose. Optical tracking is a proven technology used in many VR headsets such as Oculus Rift and PlayStation VR. However, its refresh rate is limited by the camera’s frame rate, so it has to be paired with an accelerometer and gyroscope (an IMU) that provide faster updates, while optical tracking corrects the drift introduced by the IMU. The accuracy of optical tracking depends on many factors such as lighting conditions, image recognition algorithms, processing power, the camera, and the design of the markers, with the latter three contributing directly to the BOM. So here we look at what can be achieved using a standard laptop webcam and printed ArUco markers.

ArUco markers are square fiducial markers that look similar to QR codes. They can be used to measure position and pose in 3D space when the characteristics of the camera lens and the size of the marker are known. OpenCV's ArUco module (part of the opencv_contrib modules in OpenCV 3) can generate ArUco markers and calibrate the camera. Getting OpenCV to compile with these modules is not easy, to say the least, so hopefully these steps help someone.

First, the camera needs to be calibrated. Generate and print a ChArUco board with:

g++ -ggdb `pkg-config --cflags --libs opencv3` create_board_charuco.cpp -o create_board_charuco.o && ./create_board_charuco.o -w=5 -h=7 -sl=200 -ml=120 -d=10 charuco_board.png

The printed board can then be used to generate the camera/lens characteristics yml file with:

g++ -ggdb `pkg-config --cflags --libs opencv3` calibrate_camera_charuco.cpp -o calibrate_camera_charuco.o && ./calibrate_camera_charuco.o detector_params.yml -d=0 -w=5 -h=7 -sl=0.04 -ml=0.02

The last two parameters are the measured square side length and marker side length, in meters, of the board printed in the previous step.

Now OpenCV has everything it needs to detect markers and estimate their pose. The code below is a simplified excerpt from detect_markers.cpp.

[Image: simplified code excerpt from detect_markers.cpp]

The following example tracks two independent objects relative to each other (a point is tracked relative to the head). To track the head from all sides, individual markers are attached to its front and sides. When a marker is detected, its pose is rotated and translated to a common point in the centre of the head. This ensures the same physical location is returned regardless of which marker is visible.

The front two markers are moved backwards along the z-axis to the centre of the head and moved down along the y-axis so they are on the same plane as the side markers. The side markers are moved along the z-axis to the centre and rotated to face the same direction as the front markers.

The coordinates and direction of the two objects are written to stdout as JSON and sent to a browser via WebSockets and Node.js. In the browser, the data is post-processed with a Kalman filter to remove noise, and the scene is then created and rendered using three.js.

Finally, while not well suited to tracking two objects in 3D space, OpenCV is an impressive framework with many built-in functions for everything from video effects to object recognition with neural networks. Full source here and pros and cons below.

Pros:
– Fast tracking.
– Easy to set up.
– Works well in low light (MacBook Pro webcam).
– Low cost.

Cons:
– Outside-in tracking, requiring extra equipment (camera).
– Easy to go out of frame when FOV is limited.
– Requires line of sight (when tracking two objects, one would often cover the other).
– Markers need to be perfectly flat and monochrome.
– Each camera model needs to be calibrated.
