This paper demonstrates the implementation of an Erlang-based web crawler that runs on top of the Riak NoSQL database to store and analyze images found on the web. By generating a hash of each image with a perceptual hashing algorithm, we can measure the similarity between images, which reduces the chance of storing duplicates and enables image-based search.
Perceptual hashing, image analysis, data storage, search optimization, big data, NoSQL.
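
To illustrate how a perceptual hash can expose near-duplicate images, the following is a minimal sketch and not the crawler's actual code. It is written in Python rather than Erlang, and it assumes an average hash, which is only one of several perceptual hashing schemes and may not be the one used in this work. It also assumes each image has already been reduced to a small grayscale grid. The function names and the distance threshold of 5 bits are hypothetical choices made for illustration.

```python
from typing import List


def average_hash(pixels: List[List[int]]) -> int:
    """Return one bit per pixel: 1 if the pixel is at or above the mean brightness.

    Assumes `pixels` is a small grayscale grid (for example 8x8),
    so the result fits in a 64-bit integer.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p >= mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: int, b: int, max_distance: int = 5) -> bool:
    """Treat two hashes as the same image if few bits differ (threshold is an assumption)."""
    return hamming_distance(a, b) <= max_distance


# Hypothetical 8x8 test images: a gradient, a uniformly brighter copy, and its inverse.
base = [[(r * 8 + c) * 3 for c in range(8)] for r in range(8)]
brighter = [[p + 10 for p in row] for row in base]
inverted = [[189 - p for p in row] for row in base]

print(is_near_duplicate(average_hash(base), average_hash(brighter)))
print(is_near_duplicate(average_hash(base), average_hash(inverted)))
```

Because each bit records only whether a pixel is above the image's own mean, a uniform brightness shift leaves the hash unchanged, while a structurally different image flips many bits. A small Hamming distance can therefore serve both as a duplicate check before storage and as a similarity measure for image-based search.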