Small and Large Stones of Programming: Computer Vision Archives

Converting Caltech pedestrian dataset for Python

Recently, the Caltech pedestrian dataset (http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/) has often been used as a benchmark in computer vision. However, it is stored in an unusual format and is not easy to handle. It is easier to work with in Matlab, but converting it for use from Python, for example for deep learning, is troublesome. I developed conversion tools and am publishing them here.

Continue...

Created on June 24, 2015 10:34 PM | Updated on April 6, 2016 12:52 AM

Method for packing 8-bit (int8) arrays into GPU memory by Theano

GPU memory is limited. Even the NVIDIA TITAN X, which is expensive and has 12 GB of memory, is sometimes not sufficient, and lower-end GPU models have even less. A value that fits in one byte can be stored in GPU memory as an 8-bit integer, which uses the memory more efficiently. This article describes a method for doing so.
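The saving is easy to see on the host side. The following is a minimal NumPy sketch, not the Theano method of the article itself: it assumes a hypothetical batch of grayscale images with pixel values 0 to 255 (so `uint8` is used rather than `int8`), keeps the whole set in 8-bit form, and casts only one minibatch at a time to `float32`. The array shape and the helper name `minibatch_as_float` are made up for illustration.

```python
import numpy as np

# Hypothetical data: 1000 grayscale 32x32 images with pixel values 0..255.
rng = np.random.default_rng(0)
images_u8 = rng.integers(0, 256, size=(1000, 32, 32), dtype=np.uint8)

# The same data held as float32 needs four bytes per value instead of one.
images_f32 = images_u8.astype(np.float32)
print("uint8 bytes:  ", images_u8.nbytes)
print("float32 bytes:", images_f32.nbytes)


def minibatch_as_float(store, start, size):
    """Cast a single minibatch to float32 and scale it to [0, 1].

    Only this slice is expanded to four bytes per value; the full
    store stays in its compact 8-bit form.
    """
    return store[start:start + size].astype(np.float32) / 255.0


batch = minibatch_as_float(images_u8, 0, 64)
print(batch.dtype, batch.shape)
```

The design choice is to pay the cast cost per minibatch in exchange for holding roughly four times as many samples in the same amount of memory.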

Continue...

Created on November 20, 2015 7:01 PM | Updated on November 20, 2015 10:58 PM

MPEG video file generation from Caltech dataset

The Caltech pedestrian dataset stores its videos in a special format. The following program converts them to ordinary video files. The generated files have the extension ".avi", but the program can be rewritten to produce ".mpg" files instead.
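Separately from the author's program, the first step of such a conversion can be sketched as follows. This is only an illustration under a stated assumption: that the video file is a container holding JPEG-compressed frames, each beginning with a standard JFIF header, so frames can be located by searching for that byte pattern. This is a commonly used heuristic, not a full parser of the format, and the function name `split_jpeg_frames` is hypothetical.

```python
from typing import List

# Assumption: every frame starts with a standard JFIF header.
JPEG_HEADER = b"\xff\xd8\xff\xe0\x00\x10JFIF"


def split_jpeg_frames(data: bytes) -> List[bytes]:
    """Split a byte buffer into chunks that each start with a JFIF header.

    Bytes before the first header (the container's own header) are dropped.
    Each chunk may carry a few trailing non-JPEG bytes from the container.
    """
    frames = []
    start = data.find(JPEG_HEADER)
    while start != -1:
        # The next header marks where the current frame ends.
        nxt = data.find(JPEG_HEADER, start + len(JPEG_HEADER))
        end = nxt if nxt != -1 else len(data)
        frames.append(data[start:end])
        start = nxt
    return frames


# Synthetic example: a fake container header followed by two fake frames.
sample = b"HDR" + JPEG_HEADER + b"aaa" + JPEG_HEADER + b"bb"
frames = split_jpeg_frames(sample)
print(len(frames))
```

Each chunk could then be decoded with an image library and appended to a video writer, for example `cv2.imdecode` and `cv2.VideoWriter` in OpenCV, to produce the ".avi" file.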

Continue...

Created on January 26, 2016 12:52 AM | Updated on January 26, 2016 12:57 AM
