
Datasets for Computer Vision (4)

Barbara Streisand

Release: 2024-12-09 19:43:17



*Memos:
- My post explains MNIST, EMNIST, QMNIST, ETLCDB, Kuzushiji and Moving MNIST.
- My post explains Fashion-MNIST, Caltech 101, Caltech 256, CelebA, CIFAR-10 and CIFAR-100.
- My post explains Oxford-IIIT Pet, Oxford 102 Flower, Stanford Cars, Places365, Flickr8k and Flickr30k.

(1) ImageNet (2009):

has 1,331,167 object images (1,281,167 for train and 50,000 for validation), each connected to a label from 1,000 classes.
*Memos:
- Each class has one or more names that refer to the same thing.
- You can download ILSVRC2012_devkit_t12.tar.gz, ILSVRC2012_img_train.tar and ILSVRC2012_img_val.tar.
is ImageNet() in PyTorch (torchvision.datasets).
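
A minimal sketch of how ImageNet() might be used, assuming torchvision is installed and the archives named above sit together in one folder. The helper names `missing_archives` and `load_imagenet` and the `data/imagenet` path are illustrative only and are not part of torchvision.

```python
from pathlib import Path

# Split sizes as quoted in the text above.
SPLIT_SIZES = {"train": 1_281_167, "val": 50_000}

# Archives each split needs, using the file names quoted above.
ARCHIVES = {
    "train": ("ILSVRC2012_devkit_t12.tar.gz", "ILSVRC2012_img_train.tar"),
    "val": ("ILSVRC2012_devkit_t12.tar.gz", "ILSVRC2012_img_val.tar"),
}


def missing_archives(root, split="train"):
    """Return the archive names for `split` that are not found in `root`."""
    folder = Path(root)
    return [name for name in ARCHIVES[split] if not (folder / name).is_file()]


def load_imagenet(root, split="train"):
    """Open one split with torchvision's ImageNet class.

    torchvision is imported lazily so the helpers above work without it.
    ImageNet() does not download anything; it expects the archives
    (or their already-extracted contents) under `root`.
    """
    from torchvision import datasets

    return datasets.ImageNet(root=root, split=split)


if __name__ == "__main__":
    root = "data/imagenet"  # hypothetical location
    absent = missing_archives(root, "val")
    if absent:
        print("Missing archives:", absent)
    else:
        val_set = load_imagenet(root, "val")
        print(len(val_set), "images; expected", SPLIT_SIZES["val"])
```

Each item of the returned dataset is an (image, class index) pair, so it can be passed to a DataLoader once a transform that yields tensors is supplied.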

(2) LSUN (Large-scale Scene Understanding) (2015):

has scene images in 10 datasets, Bedroom, Bridge, Church Outdoor, Classroom, Conference Room, Dining Room, Kitchen, Living Room, Restaurant and Tower:
- Bedroom has 3,033,342 bedroom images (3,033,042 for train and 300 for validation).
- Bridge has 818,987 bridge images (818,687 for train and 300 for validation).
- Church Outdoor has 126,527 church outdoor images (126,227 for train and 300 for validation).
- Classroom has 168,403 classroom images (168,103 for train and 300 for validation).
- Conference Room has 229,369 conference room images (229,069 for train and 300 for validation).
- Dining Room has 657,871 dining room images (657,571 for train and 300 for validation).
- Kitchen has 2,212,577 kitchen images (2,212,277 for train and 300 for validation).
- Living Room has 1,316,102 living room images (1,315,802 for train and 300 for validation).
- Restaurant has 626,631 restaurant images (626,331 for train and 300 for validation).
- Tower has 708,564 tower images (708,264 for train and 300 for validation).
is LSUN() in PyTorch (torchvision.datasets), but it has a bug.
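
A rough sketch of how the scene names above could be turned into the `classes` argument of LSUN(), which accepts entries such as `bedroom_train` or `church_outdoor_train`. It assumes torchvision and the `lmdb` package are installed and that the LSUN databases are already under `root`. The helper names `lsun_class_name` and `load_lsun` are made up for illustration, and the sketch does not try to work around the bug mentioned above.

```python
# Scene names as listed in the text above.
LSUN_SCENES = [
    "Bedroom", "Bridge", "Church Outdoor", "Classroom", "Conference Room",
    "Dining Room", "Kitchen", "Living Room", "Restaurant", "Tower",
]


def lsun_class_name(scene, split="train"):
    """Map a display name such as "Church Outdoor" to "church_outdoor_train"."""
    if scene not in LSUN_SCENES:
        raise ValueError(f"unknown scene: {scene!r}")
    if split not in ("train", "val"):
        raise ValueError(f"unknown split: {split!r}")
    return f"{scene.lower().replace(' ', '_')}_{split}"


def load_lsun(root, scenes, split="train"):
    """Open the given scenes with torchvision's LSUN class.

    torchvision is imported lazily so lsun_class_name() works without it.
    """
    from torchvision import datasets

    classes = [lsun_class_name(scene, split) for scene in scenes]
    return datasets.LSUN(root=root, classes=classes)


if __name__ == "__main__":
    print([lsun_class_name(scene, "val") for scene in LSUN_SCENES[:3]])
```

Passing only the scenes you need avoids opening the larger databases such as Bedroom and Kitchen.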

(3) MS COCO (Microsoft Common Objects in Context) (2014):

has object images with annotations in 16 datasets: 2014 Train images and 2014 Val images with 2014 Train/Val annotations; 2014 Test images with 2014 Testing Image info; 2015 Test images with 2015 Testing Image info; 2017 Train images and 2017 Val images with 2017 Train/Val annotations, 2017 Stuff Train/Val annotations or 2017 Panoptic Train/Val annotations; 2017 Test images with 2017 Testing Image info; and 2017 Unlabeled images with 2017 Unlabeled Image info.
*Memos:
- 2014 Train images has 82,783 images.
- 2014 Val images has 40,504 images.
- 2014 Train/Val annotations has annotations for 123,287 images (82,783 for train and 40,504 for validation) in 2014 Train images and 2014 Val images.
- 2014 Test images has 40,775 images.
- 2014 Testing Image info has image info for the 40,775 images in 2014 Test images.
- 2015 Test images has 81,434 images.
- 2015 Testing Image info has image info for the 81,434 images in 2015 Test images.
- 2017 Train images has 118,287 images.
- 2017 Val images has 5,000 images.
- 2017 Train/Val annotations has annotations for 123,287 images (118,287 for train and 5,000 for validation) in 2017 Train images and 2017 Val images.
- 2017 Stuff Train/Val annotations has annotations for 123,287 images (118,287 for train and 5,000 for validation) in 2017 Train images and 2017 Val images.
- 2017 Panoptic Train/Val annotations has annotations for 123,287 images (118,287 for train and 5,000 for validation) in 2017 Train images and 2017 Val images.
- 2017 Test images has 40,670 images.
- 2017 Testing Image info has image info for the 40,670 images in 2017 Test images.
- 2017 Unlabeled images has 123,403 images.
- 2017 Unlabeled Image info has image info for the 123,403 images in 2017 Unlabeled images.
is also called just COCO.
is CocoDetection() or CocoCaptions() in PyTorch (torchvision.datasets).
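
A small sketch of how CocoDetection() and CocoCaptions() might be pointed at the downloaded files. It assumes torchvision and `pycocotools` are installed, and that the image and annotation archives were unzipped into one folder, giving subfolders such as `train2017/` and `annotations/`. That layout, the `data/coco` path and the helper names `coco_paths` and `load_coco` are assumptions for illustration.

```python
from pathlib import Path


def coco_paths(root, year="2017", split="train", kind="instances"):
    """Return (image folder, annotation file) for one split.

    `kind` is "instances" for object annotations or "captions" for captions.
    The folder layout is an assumption: <root>/<split><year>/ for images and
    <root>/annotations/<kind>_<split><year>.json for annotations.
    """
    base = Path(root)
    image_dir = base / f"{split}{year}"
    ann_file = base / "annotations" / f"{kind}_{split}{year}.json"
    return image_dir, ann_file


def load_coco(root, year="2017", split="train", captions=False):
    """Open one split with CocoCaptions (captions=True) or CocoDetection.

    torchvision is imported lazily so coco_paths() works without it.
    """
    from torchvision import datasets

    kind = "captions" if captions else "instances"
    image_dir, ann_file = coco_paths(root, year, split, kind)
    dataset_cls = datasets.CocoCaptions if captions else datasets.CocoDetection
    return dataset_cls(root=str(image_dir), annFile=str(ann_file))


if __name__ == "__main__":
    image_dir, ann_file = coco_paths("data/coco", split="val")  # hypothetical root
    print(image_dir, ann_file)
```

With CocoDetection() each item is an image together with a list of annotation dictionaries, while CocoCaptions() pairs each image with a list of caption strings.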
