The Toshiba CAD model point clouds dataset consists of 12 shape classes: bearing1, bearing1_black, block1, bracket1, cog1, cog2, flange1, knob1, mini1, pipe1, piston1, and piston2. Each shape class contains 3D scans of a physical object at different poses. The physical object for each shape class was 3D printed from a known CAD model, as shown in Table 1.

Table 1: CAD models and physical objects of the 12 shape classes

Shape class      CAD model   Physical object
bearing1         bearing1    bearing1
bearing1_black   bearing1    bearing1_black
block1           block1      block1
bracket1         bracket1    bracket1
cog1             cog1        cog1
cog2             cog2        cog2
flange1          flange1     flange1
knob1            knob1       knob1
mini1            mini1       mini1
pipe1            pipe1       pipe1
piston1          piston1     piston1
piston2          piston2     piston2

Dataset generation

The geometry of each physical object was scanned 20 times using the real-time 3D reconstruction approach of [3], each time with a different pose and a different viewing angle. This process generated 20 shape instances for each shape class, in the form of point clouds. Along with the class label, every shape instance has an associated ground truth pose, computed by first manually registering the relevant CAD model to the point cloud approximately, then refining the registration with the Iterative Closest Point algorithm [1].
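
The refinement step can be illustrated with a minimal point-to-point ICP sketch in NumPy. It is not the code used to build the dataset: it assumes a rigid transform without scale and a brute-force nearest-neighbour search, and the function names, `R0` and `t0` (standing in for the manual registration) are hypothetical.

```python
import numpy as np


def best_rigid_transform(src, dst):
    """Least-squares R, t such that R @ src[i] + t is close to dst[i] (Kabsch)."""
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_mean).T @ (dst - dst_mean)  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = dst_mean - R @ src_mean
    return R, t


def icp(model, scan, R0, t0, iterations=30):
    """Refine an initial pose (R0, t0) of `model` (N x 3) against `scan` (M x 3)."""
    R, t = R0, t0
    for _ in range(iterations):
        moved = model @ R.T + t  # model points under the current estimate
        # Brute-force closest scan point for every moved model point.
        d2 = ((moved[:, None, :] - scan[None, :, :]) ** 2).sum(axis=2)
        nearest = scan[d2.argmin(axis=1)]
        # Re-estimate the pose from the current correspondences.
        R, t = best_rigid_transform(model, nearest)
    return R, t
```

The brute-force distance matrix is only practical for small point sets; a spatial index would be the usual choice for full scans. As with any ICP variant, the result depends on the initial guess being close, which is why the manual registration comes first.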

Given a test point cloud and a set of training point clouds (with known class and pose), the computation of input pose votes is a two-stage process. In the first stage, local shape features, consisting of a descriptor and a scale, translation and rotation relative to the object, are computed on all the point clouds. This is done by first converting a point cloud to a 128-by-128-by-128 voxel volume using a Gaussian on the distance of each voxel centre to the nearest point. Then interest points are localized in the volume across 3D location and scale using the Difference of Gaussians operator, and a canonical orientation for each interest point is computed [2], to generate a local feature pose. Finally, a basic 31-dimensional descriptor is computed by simply sampling the volume (at the correct scale) at 31 regularly distributed locations around the interest point.
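
The voxelization step can be sketched as follows in NumPy. This is an illustration under stated assumptions, not the original implementation: the Gaussian width `sigma` (in voxel units), the way the grid is fitted to the bounding box, and the function name are all hypothetical, since the text does not specify them.

```python
import numpy as np


def point_cloud_to_volume(points, resolution=128, sigma=1.5, chunk=1024):
    """Voxel value = exp(-d^2 / (2 s^2)), with d the distance from the voxel
    centre to the nearest point and s = sigma * voxel_size (assumed width)."""
    points = np.asarray(points, dtype=float)
    lo, hi = points.min(axis=0), points.max(axis=0)
    # Cubic voxels: the grid spans the largest bounding-box extent on every axis.
    voxel_size = (hi - lo).max() / resolution
    idx = (np.arange(resolution) + 0.5) * voxel_size
    gx, gy, gz = np.meshgrid(idx, idx, idx, indexing="ij")
    centres = np.stack([gx, gy, gz], axis=-1).reshape(-1, 3) + lo

    # Brute-force nearest-point search, processed in chunks to bound memory.
    d2_min = np.empty(len(centres))
    for start in range(0, len(centres), chunk):
        block = centres[start:start + chunk]
        d2 = ((block[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
        d2_min[start:start + chunk] = d2.min(axis=1)

    volume = np.exp(-d2_min / (2.0 * (sigma * voxel_size) ** 2))
    return volume.reshape(resolution, resolution, resolution)
```

For example, `point_cloud_to_volume(cloud, resolution=32)` gives a coarse volume that is quick to inspect. At the full 128-by-128-by-128 resolution the brute-force search becomes slow for large clouds, so a spatial index for the nearest-point query would be the natural replacement.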

In the second stage, each test feature is matched to the 20 nearest training features, in terms of Euclidean distance between descriptors. Each of these matches generates a vote $X_i = A B^{-1} C$ for the test object's pose, $A$, $B$ and $C$ being the test feature, training feature and training object's ground truth pose respectively. In addition, each vote has a weight, $\lambda_i$, computed as $(N_C N_I)^{-1}$, $N_C$ being the number of training instances in the class and $N_I$ the number of features found in the feature's particular instance.
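
The matching and voting stage can be sketched in NumPy as below. The data layout is an assumption made for illustration: poses are stored as 4-by-4 homogeneous matrices, each training feature carries the ground truth pose of its own instance, and $N_I$ is taken to be the feature count of the training feature's instance. The function and argument names are hypothetical.

```python
import numpy as np


def feature_weights(instance_ids, instance_class):
    """Weight 1 / (N_C * N_I) for every training feature.

    instance_ids:   (N,) index of the instance each training feature came from.
    instance_class: (num_instances,) class index of each training instance.
    """
    instance_ids = np.asarray(instance_ids)
    instance_class = np.asarray(instance_class)
    n_i = np.bincount(instance_ids, minlength=len(instance_class))  # features per instance
    n_c = np.bincount(instance_class)                               # instances per class
    return 1.0 / (n_c[instance_class[instance_ids]] * n_i[instance_ids])


def cast_votes(test_desc, test_poses, train_desc, train_poses, train_gt, weights, k=20):
    """Return (votes, vote_weights) with votes X = A @ inv(B) @ C.

    test_desc (M, D), test_poses (M, 4, 4): test descriptors and feature poses A.
    train_desc (N, D), train_poses (N, 4, 4): training descriptors and poses B.
    train_gt (N, 4, 4): ground truth object pose C for each training feature.
    """
    k = min(k, len(train_desc))
    # Squared Euclidean distances between all test and training descriptors.
    d2 = ((test_desc[:, None, :] - train_desc[None, :, :]) ** 2).sum(axis=2)
    nn = np.argsort(d2, axis=1)[:, :k]        # (M, k) nearest training features
    A = test_poses[:, None, :, :]             # (M, 1, 4, 4), broadcast over k
    B_inv = np.linalg.inv(train_poses)[nn]    # (M, k, 4, 4)
    C = train_gt[nn]                          # (M, k, 4, 4)
    votes = A @ B_inv @ C
    return votes.reshape(-1, 4, 4), weights[nn].reshape(-1)
```

With this weighting, the weights of all training features in one class sum to 1, so classes with more instances or more detected features do not dominate the vote.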

The point clouds, the features, the votes, and the lists of rotational symmetries for all shape classes are available on the Downloads page.


[1] P. Besl and N. McKay. A method for registration of 3D shapes. TPAMI, 14(2), 1992.

[2] F. Tombari, S. Salti, and L. Di Stefano. Unique signatures of histograms for local surface description. In ECCV, 2010.

[3] G. Vogiatzis and C. Hernández. Video-based, real-time multi view stereo. Image and Vision Computing, 29(7):434–441, 2011.


The dataset was introduced in the following paper, in the context of 3D object recognition using a vote-based approach:
A New Distance for Scale-Invariant 3D Shape Recognition and Registration
Minh-Tri Pham, Oliver J. Woodford, Frank Perbet, Atsuto Maki, Björn Stenger, Roberto Cipolla
Cambridge Research Laboratory and University of Cambridge
Published in ICCV 2011


To our knowledge, it was also used in the following papers:

Distances and Means of Direct Similarities
Minh-Tri Pham, Oliver J. Woodford, Frank Perbet, Atsuto Maki, Riccardo Gherardi, Björn Stenger
Cambridge Research Laboratory
Published in IJCV 2014

Full-Angle Quaternion for Robustly Matching Vectors of 3D Rotations
Stephan Liwicki, Minh-Tri Pham, Stefanos Zafeiriou, Maja Pantic, Björn Stenger
Cambridge Research Laboratory
Published in CVPR 2014

Demisting the Hough Transform for 3D Shape Recognition and Registration
Oliver J. Woodford, Minh-Tri Pham, Atsuto Maki, Frank Perbet, Björn Stenger
Cambridge Research Laboratory
Published in BMVC 2011 and IJCV 2013


An Evaluation of Volumetric Interest Points
Tsz-Ho Yu, Oliver J. Woodford, Roberto Cipolla
Cambridge Research Laboratory and University of Cambridge
Published in 3DIM/3DPVT 2011
