In this paper, we address the taxonomic classification of diatoms from images taken under a light microscope. The corresponding machine learning task is hierarchical multi-label classification, where the taxonomy plays the role of the label hierarchy. More specifically, an image is assigned several labels: a single lowest-level taxonomic unit (species) together with its ancestors in the taxonomy (family). Since convolutional neural networks (CNNs) are the state of the art in image classification, we apply them to this problem. Because we have a relatively small set of diatom images, we use transfer learning with an ImageNet pre-trained InceptionV3 model. We explore two avenues of transfer. The first, which is commonly applied, freezes some layers of the pre-trained network and fine-tunes the unfrozen layers on diatom images; here we use one output neuron for each leaf node in the taxonomy. The second uses the features extracted by the ImageNet pre-trained InceptionV3 model to train a tree-ensemble classifier, in particular ensembles of predictive clustering trees for hierarchical multi-label classification (PCTs for HMC). We compare our results with earlier work on this task, which used ensembles of PCTs for HMC on hand-crafted features extracted from the diatom images, as well as features extracted by the scale-invariant feature transform (SIFT). The transfer learning approach of fine-tuning the ImageNet pre-trained CNN achieves excellent predictive performance.
Hierarchical multi-label classification, Diatoms, Transfer learning, Convolutional neural networks, Feature extraction, Predictive clustering trees, Tree ensembles
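To make the label structure concrete, the following minimal sketch shows how a single leaf label (species) can be expanded into the full set of hierarchical labels by adding its ancestors. The two-level taxonomy, the node names (`family_A`, `species_a1`, and so on), and the helper functions are hypothetical placeholders chosen for illustration; they are not the taxonomy or the code used in this work.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy (an assumption for illustration only):
# each node maps to its parent; None marks a top-level node.
PARENT: Dict[str, Optional[str]] = {
    "family_A": None,
    "family_B": None,
    "species_a1": "family_A",
    "species_a2": "family_A",
    "species_b1": "family_B",
}

# Fixed ordering of all nodes, used as the positions of the label vector.
NODES: List[str] = sorted(PARENT)


def labels_with_ancestors(leaf: str) -> List[str]:
    """Return the leaf label followed by all of its ancestors."""
    path: List[str] = []
    node: Optional[str] = leaf
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def to_indicator(leaf: str) -> List[int]:
    """Encode a leaf label as a 0/1 vector over all taxonomy nodes."""
    active = set(labels_with_ancestors(leaf))
    return [1 if n in active else 0 for n in NODES]


if __name__ == "__main__":
    print(labels_with_ancestors("species_a2"))
    print(to_indicator("species_a2"))
```

In this encoding, a classifier with one output per leaf node only needs to predict the species; the ancestor labels follow deterministically from the taxonomy, whereas a hierarchical multi-label learner can be trained on the full indicator vector directly.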