Web proceedings papers

Authors

Bane Georgievski, Veno Pachovski and Biljana Stojcevska

Abstract

This paper demonstrates the implementation of an Erlang-based web crawler that runs on top of a Riak NoSQL database to store and analyze images found on the web. By generating a hash of each image with a perceptual hashing algorithm, we can measure the similarity between images, which reduces the chance of storing duplicates and enables image-based search.
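
As an illustration of the idea, the sketch below shows how a perceptual hash can expose near-duplicate images. It is not the paper's implementation: the abstract does not name the hashing algorithm, so a simple average hash is assumed here, the code is written in Python rather than Erlang, and images are stood in for by small grayscale grids. All function names, the sample data, and the threshold are hypothetical.

```python
# Assumption: a simple "average hash" stands in for the unspecified
# perceptual hashing algorithm. Images are small grayscale grids (0-255).

def average_hash(pixels):
    """Return an integer with one bit per pixel: 1 if brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a, b, threshold=2):
    """Treat two hashes as the same image if few bits differ (threshold is illustrative)."""
    return hamming_distance(a, b) <= threshold


# Hypothetical 4x4 image: dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]
# The same image, uniformly brightened by 10.
brightened = [[p + 10 for p in row] for row in original]
# A different image: bright top half, dark bottom half.
different = [
    [200, 210, 205, 215],
    [202, 212, 208, 218],
    [10, 20, 15, 25],
    [12, 22, 18, 28],
]

h_original = average_hash(original)
h_brightened = average_hash(brightened)
h_different = average_hash(different)

print(hamming_distance(h_original, h_brightened))  # 0: same structure
print(hamming_distance(h_original, h_different))   # 8: half of the 16 bits differ
```

Because the hash depends on each pixel's brightness relative to the image mean, a uniform brightness change leaves it untouched, while a structurally different image lands far away in Hamming distance. A crawler could compare a new image's hash against stored hashes and skip or link the image when the distance falls under a chosen threshold.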

Keywords

Perceptual hashing, image analysis, data storage, search optimization, big data, NoSQL.