The goal of this project was to demonstrate a system with so-called hand-eye coordination, that is, coordination between a camera and a robot arm. A set of letters is placed arbitrarily on a table in view of the camera, and the robot arm picks up and orders the letters that the user has entered. The position and orientation of each letter are arbitrary, as long as the letter lies within both the robot arm's workspace and the camera's field of view. The letters are assumed to be face up and non-overlapping. The project has two main components: computer vision and image processing, and robot manipulation. Here we present the first component, which involves camera calibration, letter recognition, and determination of each letter's position and orientation relative to the robot base coordinate system. Once the letters are recognized and their positions and orientations are known, the robot arm is directed to pick up and order the selected letters.
For this project we used wooden Scrabble letters. We first constructed a database of letters by scanning the Scrabble letters (Fig. 1) and then processing the scans to obtain a set of binary images containing only the letters (Fig. 2), i.e. with any additional features removed.
For camera calibration we drew the "x" and "y" axes of a 2D coordinate system on the table; they can be seen in Figs. 3, 4, and 5. The 2D coordinate system was placed at a known position and with a known orientation relative to the robot base coordinate system. Since all the letters lay in one plane (i.e. on the table), a single camera was sufficient for calibrating the system and for determining the 3D position and orientation of each letter in the coordinate system of the robot base.
The first step in letter recognition was edge detection (Figs. 3, 4, and 5, top middle), after which we automatically removed the coordinate system from the edge image and isolated the individual letters. Isolated letters J, X, and P can be seen in the upper right corners of Figs. 3, 4, and 5, respectively. For each isolated letter we detected two adjacent top borders of the wooden block (Figs. 3-5, middle left). The two borders defined a local 2D coordinate system, which allowed us to back-project the letter (Figs. 3-5, middle center). Because the borders alone do not reveal which way up the letter is, we tested the four possible orientations of the letter (Figs. 3-5, middle right and bottom row) against the database of letters. In each tested case the system correctly recognized the letter.
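The orientation test can be illustrated with a minimal Python sketch. It assumes that the back-projected letter has already been resampled to the same size as the database templates, and it scores a match as the fraction of agreeing pixels. The matching score used in the original system is not described, so this measure and the names `rotate90`, `similarity`, and `recognize` are assumptions made for illustration only.

```python
def rotate90(img):
    """Rotate a binary image (a list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def similarity(a, b):
    """Fraction of pixels on which two same-sized binary images agree."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        return 0.0
    total = len(a) * len(a[0])
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def recognize(patch, database):
    """Try all four quarter-turn orientations of `patch` against every
    template and return (label, quarter_turns, score) of the best match."""
    best = (None, 0, -1.0)
    candidate = patch
    for quarter_turns in range(4):
        for label, template in database.items():
            score = similarity(candidate, template)
            if score > best[2]:
                best = (label, quarter_turns, score)
        candidate = rotate90(candidate)
    return best


# Toy 3 x 3 "database" (hypothetical; the real templates are scanned letters).
toy_database = {
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}
rotated_l = rotate90(toy_database["L"])  # an "L" seen a quarter turn off
label, turns, score = recognize(rotated_l, toy_database)
print(label, turns, score)
```

The number of quarter turns returned with the label also resolves the remaining ambiguity in the letter's orientation, which the gripper needs in order to place the letter the right way up.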
The 2D coordinate system (shown in Figs. 3, 4, and 5) was automatically detected and processed to determine its origin, the orientation of its two axes, and their unit steps (indicated by tick marks; each axis had two tick marks). Since the 2D coordinate system had known unit steps and a known position and orientation relative to the coordinate system of the robot base, and since all the letters were placed on the table, we were able to compute, for any given point in the camera image, its 3D coordinates in the robot base coordinate system. The 2D coordinate system with a superimposed grid is shown in Fig. 6.
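One common way to realize such an image-to-table mapping is a planar homography fitted to four point correspondences, for example the four tick marks. The sketch below is an assumption about how this could be done, not a description of the original implementation: the function names, the choice of a homography, and all numeric values are hypothetical. The table point is then moved into the robot base frame by a planar rotation and translation, with the height fixed at the table level.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_homography(src_pts, dst_pts):
    """Fit the 8 homography parameters (h33 fixed to 1) that map four
    source points onto four destination points (no three collinear)."""
    a, b = [], []
    for (u, v), (x, y) in zip(src_pts, dst_pts):
        a.append([u, v, 1, 0, 0, 0, -u * x, -v * x])
        b.append(x)
        a.append([0, 0, 0, u, v, 1, -u * y, -v * y])
        b.append(y)
    return solve_linear(a, b)


def apply_homography(h, u, v):
    """Map a point (u, v) through the homography h."""
    w = h[6] * u + h[7] * v + 1.0
    return ((h[0] * u + h[1] * v + h[2]) / w,
            (h[3] * u + h[4] * v + h[5]) / w)


def table_to_base(x, y, origin_in_base, angle_rad, table_z):
    """Rigidly move a table-plane point into the robot base frame."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    ox, oy = origin_in_base
    return (ox + c * x - s * y, oy + s * x + c * y, table_z)


# Hypothetical tick-mark positions: table units and detected image pixels.
ticks_table = [(1, 0), (2, 0), (0, 1), (0, 2)]
ticks_image = [(138.6, 195.0), (176.5, 190.2), (102.9, 161.8), (105.8, 125.0)]
h = fit_homography(ticks_image, ticks_table)

# Any image point on the table can now be expressed in the base frame.
x, y = apply_homography(h, 150.0, 160.0)
print(table_to_base(x, y, origin_in_base=(300.0, -100.0),
                    angle_rad=math.radians(90), table_z=0.0))
```

The four tick marks are sufficient here because no three of them are collinear; the origin provides a fifth point that could be used to check the fit.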
Once the letters were recognized and their positions and orientations determined in the robot base coordinate system, the robot arm was directed to pick up the letters and order them to form user-specified words. For this experiment we used a Connectix Quickcam with a resolution of 320 x 240 pixels and a Mitsubishi Move Master EX (model RV-M1) robot arm with five revolute joints.