Web proceedings papers

Authors

Andreja Naumoski, Georgina Mirceva and Kosta Mitreski

Abstract

Knowledge discovery from environmental data aims to uncover the patterns underlying measured ecological data and, where possible, to predict future events. Besides the environmental data at hand, this process requires a machine learning algorithm that produces an accurate and human-interpretable model, a requirement on which both ecologists and decision-makers agree. In this paper, we therefore investigate the influence of parameter tuning on a predictive decision tree algorithm, predictive clustering trees, when building biodiversity models with different numbers of discretization levels for the target attribute. The target attribute is the biodiversity index, which is calculated from the diatoms' abundances and then discretized. To build the models, we use a decision tree machine learning algorithm that produces relatively accurate models and, most importantly, models that are easy for biologists to interpret without being familiar with the inner workings of the algorithm itself. Besides the experimental evaluation of the models' performance and the statistical significance of the results, some of the obtained models are also presented and discussed.
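As a rough illustration of the target-attribute preparation described in the abstract, the sketch below computes a biodiversity index from per-sample abundances and discretizes it into a chosen number of levels. The abstract does not say which index or which discretization scheme is used, so the Shannon index and equal-width binning are assumptions made here only for illustration; the function names and the sample abundances are hypothetical as well.

```python
import math


def shannon_index(abundances):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero abundance.

    Assumption: the abstract does not name the biodiversity index; Shannon is
    used here only as a common example.
    """
    total = sum(abundances)
    if total <= 0:
        return 0.0
    h = 0.0
    for a in abundances:
        if a > 0:
            p = a / total
            h -= p * math.log(p)
    return h


def discretize_equal_width(values, levels):
    """Map each value to a level in 0..levels-1 using equal-width bins.

    Assumption: equal-width binning is one possible scheme; the discretization
    actually used in the paper is not specified in the abstract.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / levels
    # The maximum value falls on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), levels - 1) for v in values]


# Hypothetical diatom abundances: one list of taxon counts per sample.
samples = [
    [10, 10, 10, 10],
    [37, 1, 1, 1],
    [20, 20, 0, 0],
]
index_values = [shannon_index(s) for s in samples]

# Try several numbers of discretization levels for the target attribute.
for levels in (2, 3, 5):
    print(levels, discretize_equal_width(index_values, levels))
```

The resulting level labels could then serve as the class attribute for a decision tree learner; varying `levels` corresponds to the different numbers of discretization levels compared in the paper.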

Keywords

Ecological modeling, Biodiversity indices, Diatoms, Machine learning algorithm