
Using Deep Learning to Observe African Wildlife – Automatically identifying wild animals in camera-trap images with deep learning

Entry

Using Deep Learning to Observe African Wildlife – Automatically identifying wild animals in camera-trap images with deep learning

Simple Title

Automatically identifying wild animals in camera-trap images with deep learning

Description

Using Deep Learning to Observe African Wildlife – Automatically identifying wild animals in camera-trap images with deep learning

Type
Paper
Year

2017

Posted at
October 25, 2017
Tags
cross-modal
image

Abstract

Having accurate, detailed, and up-to-date information about wildlife location and behavior across broad geographic areas would revolutionize our ability to study, conserve, and manage species and ecosystems. Currently, such data are mostly gathered manually at great expense, and thus are sparsely and infrequently collected. Here we investigate the ability to automatically, accurately, and inexpensively collect such data, which could transform many fields of biology, ecology, and zoology into "big data" sciences. Motion sensor cameras called "camera traps" enable pictures of wildlife to be collected inexpensively, unobtrusively, and at high-volume. However, identifying the animals, animal attributes, and behaviors in these pictures remains an expensive, time-consuming, manual task often performed by researchers, hired technicians, or crowdsourced teams of human volunteers. In this paper, we demonstrate that such data can be automatically extracted by deep neural networks (aka deep learning), which is a cutting-edge type of artificial intelligence. In particular, we use the existing human-labeled, single-animal images from the Snapshot Serengeti dataset to train deep convolutional neural networks for identifying 48 species in 3.2 million images taken from Tanzania’s Serengeti National Park. In this paper we train neural networks that automatically identify animals with over 92% accuracy, and we expect that number to improve rapidly in years to come. More importantly, we can choose to have our system classify only the images it is highly confident about, allowing valuable human time to be focused only on challenging images. In this case, our system can automate animal identification for 96.9% of the data while still performing at the same 96.6% accuracy level of crowdsourced teams of human volunteers, saving approximately ∼8.2 years (at 40 hours per week) of human labeling effort (i.e. over 17,000 hours) on a 3.2-million-image dataset.
Those efficiency gains immediately highlight the importance of using deep neural networks to automate data extraction from camera-trap images. The improvements in accuracy we expect in years to come suggest that this technology could enable the inexpensive, unobtrusive, high-volume and perhaps even real-time collection of information about vast numbers of animals in the wild.

Motivation

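In recent years, camera traps (fixed cameras that shoot automatically when triggered by motion or infrared sensors) have become widely used to study the ecology of wild animals in Africa.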

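Until now, hand-labeling these photos, many of which are blurry or show only part of an animal, has taken an enormous amount of effort. This study applies image recognition based on convolutional neural networks to that task.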

Results

image

An example of a camera trap

The training data is the Snapshot Serengeti dataset, a collection of 3.2 million photos taken by camera traps installed in Tanzania's Serengeti National Park. (The labels for the individual animals were crowdsourced.)

image

Example photos from the dataset

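I will skip the details, but the paper reports that, compared with identifying all 3.2 million photos by hand, introducing deep learning saves over 17,000 hours of labeling time (roughly 8.2 years at 40 hours per week) while keeping accuracy comparable to that of the human volunteers!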
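The abstract mentions letting the system classify only the images it is highly confident about and leaving the rest to people. Here is a minimal sketch of what that kind of triage could look like. The function name, the 0.9 threshold, and the probabilities are all made up for illustration; this note does not cover the paper's actual confidence criterion, so please do not read this as the authors' implementation.

```python
from typing import List, Sequence, Tuple


def triage_by_confidence(
    probs: Sequence[Sequence[float]],
    threshold: float,
) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Split images into auto-labeled and human-review groups.

    probs[i] holds per-class probabilities (e.g. softmax outputs) for image i.
    Returns (auto, review): auto lists (image_index, predicted_class) pairs,
    review lists the indices of images left for human labelers.
    """
    auto: List[Tuple[int, int]] = []
    review: List[int] = []
    for i, p in enumerate(probs):
        # Index of the most probable class for this image.
        best = max(range(len(p)), key=lambda k: p[k])
        if p[best] >= threshold:
            auto.append((i, best))  # confident enough: accept the prediction
        else:
            review.append(i)        # not confident: send to a human
    return auto, review


# Hypothetical probabilities for four images and three classes.
example_probs = [
    [0.97, 0.02, 0.01],
    [0.40, 0.35, 0.25],
    [0.05, 0.90, 0.05],
    [0.10, 0.30, 0.60],
]
auto, review = triage_by_confidence(example_probs, threshold=0.9)
coverage = len(auto) / len(example_probs)  # fraction labeled automatically
print(auto, review, coverage)
```

Raising the threshold sends more images to human review and should make the automatic labels more reliable; lowering it does the opposite. The 96.9% figure in the abstract corresponds to this kind of coverage number.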

Further Thoughts

Going forward, deep learning will likely be used more and more widely for surveys across many research fields.

Technically there is nothing especially new here, but I found the application interesting, so I wrote it up! 😀

Links