Problem statement

Recent developments in 3D sensing devices and techniques have made collecting 3D data faster and more affordable than ever. The growing volume of 3D data can provide useful information for content-rich tasks. For example, semantically labelled 3D data can be used to produce detailed 3D models. With a semantic understanding of 3D data, decision-making systems such as automatic navigation and car parking can make better decisions. However, semantically labelling 3D data requires time-consuming manual work.


Deep learning, a class of machine learning algorithms, can serve as a tool for 3D data processing. GRID aims to develop a deep learning-based end-to-end solution for the following tasks:

  • Classification. Classification means assigning objects to categories. Classification is often followed by localisation, namely, producing bounding boxes for the classified objects. In some literature, classification and localisation together are referred to as object detection. In the 2D domain, deep learning now dominates the classification task through a set of architectures and techniques customised for processing images.
  • Segmentation. Segmentation is classification at a fine-grained level, namely, producing a label for each element (point or voxel) of the 3D data. Compared to a bounding box, segmentation localises objects more precisely because it follows their boundaries. Segmentation comprises semantic segmentation and instance segmentation. The former only makes element-wise class predictions for a scene, while the latter must also identify each object separately. For example, given the 3D data of an office, semantic segmentation assigns the same label to all the chairs. Instance segmentation, however, not only detects all the chairs but also distinguishes each chair individually.
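
The difference between semantic and instance segmentation can be illustrated with a minimal sketch. The toy scene, the label names and the helper functions below are hypothetical and chosen only for illustration; they are not part of the GRID solution. Each list entry stands for one element (point or voxel) of an office scan.

```python
from collections import Counter

# Hypothetical toy scene: one entry per element (point or voxel).
# semantic_labels[i] is the class predicted for element i.
# instance_ids[i] is the object that element i belongs to.
semantic_labels = ["chair", "chair", "chair", "chair", "desk", "desk", "floor"]
instance_ids = [0, 0, 1, 1, 2, 2, 3]


def count_elements_per_class(labels):
    """Semantic view: how many elements carry each class label."""
    return Counter(labels)


def count_objects_per_class(labels, ids):
    """Instance view: how many distinct objects exist for each class."""
    objects = {}
    for label, inst in zip(labels, ids):
        objects.setdefault(label, set()).add(inst)
    return {label: len(insts) for label, insts in objects.items()}


elements_per_class = count_elements_per_class(semantic_labels)
objects_per_class = count_objects_per_class(semantic_labels, instance_ids)
```

In this sketch the semantic labels alone say that four elements are "chair", but only the instance ids reveal that those elements form two separate chairs.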


Our team of experts aims to investigate and develop a deep learning-based solution for 3D data processing that handles both classification and segmentation tasks. The expected deliverables are as follows:

  • Detailed reports addressing identified issues and use cases for the stakeholders.
  • Proofs of concept, including software prototypes.
  • National and international publications to provide visibility to the parties involved.
  • Automatic semantic label generation for the Scan-to-BIM process.