Towards a Taxonomy Machine. A Training Set of 5.6 Million Arthropod Images

This paper is a preprint and has not been certified by peer review.

Authors

Steinke, D.; Ratnasingham, S.; Agda, J.; Ait Boutou, H.; Box, I.; Boyle, M.; Chan, D.; Feng, C.; Lowe, S. C.; McKeown, J. T.; McLeod, J.; Sanchez, A.; Smith, I.; Walker, S.; Wei, C. Y.-Y.; Hebert, P. D. N.

Abstract

The taxonomic identification of organisms from images is an active research area within the machine learning community. Current algorithms are very effective for object recognition and discrimination, but they require extensive training datasets to generate reliable assignments. This study releases 5.6 million images with representatives from 10 arthropod classes and 26 insect orders. All images were taken using a Keyence VHX-7000 Digital Microscope system with an automatic stage to permit high-resolution (4K) microphotography. Providing phenotypic data for 324,000 species derived from 48 countries, this release represents, by far, the largest dataset of standardized arthropod images. As such, this dataset is well suited for testing the efficacy of machine learning algorithms for identifying specimens to higher taxonomic categories.
