Surface Defect Detection: Dataset & Papers

Introduction

*** Star ⭐ ~ ***

Star anti-lost

❤️

1. Key Issues in Surface Defect Detection

1Small Sample Problem

- Data Amplification and Generation

- Network Pre-training and Transfer Learning

- Reasonable Network Structure Design

- Unsupervised or Semi-supervised Method

In the unsupervised model, only normal samples are used for training, so there is no need for defective samples. The semi-supervised method can use unlabeled samples to solve the network training problem in the case of small samples.

BACK to Table of Contents -->

2Real-time Problem

BACK to Table of Contents -->

2. Common Datasets for Industrial Surface Defect Detection

1Steel Surface: NEU-CLS

NEU-CLS can be used for classification and positioning tasks.

❌ Official Linkhttp://faculty.neu.edu.cn/yunhyan/NEU_surface_defect_database.html

latest access - (#16)

Kaggle - Severstal: Steel Defect Detection

Severstal is leading the charge in efficient steel mining and production. They believe the future of metallurgy requires development across the economic, ecological, and social aspects of the industryand they take corporate responsibility seriously. The company recently created the countrys largest industrial data lake, with petabytes of data that were previously discarded. Severstal is now looking to machine learning to improve automation, increase efficiency, and maintain high quality in their production.

https://www.kaggle.com/c/severstal-steel-defect-detection

BACK to Table of Contents -->

2Solar Panels: elpv-dataset

linkhttps://github.com/zae-bayern/elpv-dataset

The dataset contains 2,624 samples of 300x300 pixels 8-bit grayscale images of functional and defective solar cells with varying degree of degradations extracted from 44 different solar modules. The defects in the annotated images are either of intrinsic or extrinsic type and are known to reduce the power efficiency of solar modules.

All images are normalized with respect to size and perspective. Additionally, any distortion induced by the camera lens used to capture the EL images was eliminated prior to solar cell extraction.

BACK to Table of Contents -->

3Metal Surface: KolektorSDD

The dataset is constructed from images of defected electrical commutators that were provided and annotated by Kolektor Group. Specifically, microscopic fractions or cracks were observed on the surface of the plastic embedding in electrical commutators. The surface area of each commutator was captured in eight non-overlapping images. The images were captured in a controlled environment.

Official Link:https://www.vicos.si/Downloads/KolektorSDD
Download Linkhttps://pan.baidu.com/share/init?surl=HSzHC1ltHvt1hSJh_IY4Jg (password1zlb)
Implementation https://github.com/skokec/segdec-net-jim2019

The dataset consists of:

50 physical items (defected electrical commutators)
8 surfaces per item
Altogether 399 images:
-- 52 images of visible defect
-- 347 images without any defect
Original images of sizes:
-- width: 500 px
-- height: from 1240 to 1270 px
For training and evaluation images should be resized to 512 x 1408 px

For each item the defect is only visible in at least one image, while two items have defects on two images, which means there were 52 images where the defects are visible. The remaining 347 images serve as negative examples with non-defective surfaces.

BACK to Table of Contents -->

4PCB Inspection: DeepPCB

Download Linkhttps://github.com/Charmve/Surface-Defect-Detection/tree/master/DeepPCB

BACK to Table of Contents -->

5Fabric Defects Dataset: AITEX

Download Linkhttps://pan.baidu.com/s/1cfC4Ll5QlnwN5RTuSZ6b7w (passwordb9uy)

This dataset consists of 245 4096x256 pixel images with seven different fabric structures. There are 140 non-defect images in the dataset, 20 of each type of fabric. In addition, there are 105 images of different types of fabric defects (12 types) common in the textile industry. The image size allows users to use different window sizes, thereby the number of samples can be increased. The online dataset also contains segmentation masks of all defective images, so that white pixels represent defective areas and the remaining pixels are black.

BACK to Table of Contents -->

6Fabric Defect Dataset (Tianchi)

Download Linkhttps://pan.baidu.com/s/1LMbujxvr5iB3SwjFGYHspA (passwordgat2)

In the actual production process of cloth, due to the influence of various factors, defects such as stains, holes, lint, etc. will occur. In order to ensure the quality of the product, the cloth needs to be inspected for defects.

Fabric defect inspection is an important part of the textile industry's production and quality management. At present, manual inspection is susceptible to subjective factors and lacks consistency, and inspection personnel working for a long time under strong light has a great impact on vision. Due to the wide variety of fabric defects, various morphological changes, and the difficulty of observation and recognition, the intelligent detection of fabric defects has been a technical bottleneck that has plagued the industry for many years.

This dataset covers all kinds of important defects in fabrics in the textile industry, and each picture contains one or more defects. The data includes two types of plain cloth and patterned cloth. Among them, about 8000 pieces of plain cloth data are used for preliminary matches, and about 12,000 pieces of patterned cloth data are used for semi-finals.

BACK to Table of Contents -->

7Aluminium Profile Surface Defect DatasetTianchi

Download Linkhttps://tianchi.aliyun.com/competition/entrance/231682/information

Due to the influence of various factors in the actual production process of aluminum profile, the surface of the aluminum profile will have cracks, peeling, scratches and other defects, which will seriously affect the quality of the aluminum profile. To ensure product quality, manual visual inspection is required. However, the surface of the aluminum profile itself contains textures, which are not highly distinguishable from defects.

Traditional manual visual inspection methods have many shortcomings, which are very laborious, cannot accurately judge surface defects in time, and have difficult to control the efficiency of quality inspection. In recent years, deep learning has made rapid progress in image recognition and other fields. Aluminum profile manufacturers are eager to use the latest AI technology to innovate the existing quality inspection process, automatically complete quality inspection tasks, reduce the incidence of missed inspections, and improve product quality. AI technology, especially deep learning, makes aluminum profile product production managers completely free from the inability to fully grasp the state of product surface quality.

In the dataset of the competition, there are 10,000 pieces of monitoring image data from aluminum profiles with defects in actual production, and each image contains one or more defects. The sample image for machine learning will clearly identify the type of defect contained in the image.

BACK to Table of Contents -->

8Weakly Supervised Learning for Industrial Optical InspectionDAGM 2007

Download Linkhttps://hci.iwr.uni-heidelberg.de/node/3616

Dataset introduction:

Mainly aimed at miscellaneous defects on textured backgrounds.
Training data with weaker supervision.
Contains ten data sets, the first six are training data sets, and the last four are test data sets.
Each dataset contains 1000 "non-defective" images and 150 "defective" images saved in grayscale 8-bit PNG format. Each data set is generated by a different texture model and defect model.
The background texture of the "No Defect" image shows no defect, and the background texture of the "No Defect" image has exactly one marked defect.
All datasets have been randomly divided into training and testing sub-data sets of equal size.
Weak labels are represented by ellipses, which roughly indicate the defect area.

BACK to Table of Contents -->

9Cracks on the Surface of the Construction

CrackForest Dataset is an annotated road crack image database which can reflect urban road surface condition in general.

Github Linkhttps://github.com/cuilimeng/CrackForest-dataset
Download linkhttps://pan.baidu.com/s/1108j5QbDr7T3XQvDxAzVpg (passwordjajn)

Bridge cracks. There are 2688 images of bridge crack without pixel-level ground truth. From the authors "Liangfu Li, Weifei Ma, Li Li, Xiaoxiao Gao". Files can be reached by visiting https://github.com/Charmve/Surface-Defect-Detection/tree/master/Bridge_Crack_Image.
Crack on road surface. From Shi Yong, and Cui Limeng and Qi Zhiquan and Meng Fan and Chen Zhensong. Original dataset can be reached at https://github.com/Charmve/Surface-Defect-Detection/tree/master/CrackForest. We extract the image files of the pixel level ground truth.

BACK to Table of Contents -->

10Magnetic Tile Dataset

Magnetic tile dataset by githuber: abin24, which can be downloaded from https://github.com/Charmve/Surface-Defect-Detection/tree/master/Magnetic-Tile-Defect, which was used in their paper "Surface defect saliency of magnetic tile", the paper can be reach by here or here

This is also the datasets of the paper "Saliency of magnetic tile surface defects". The images of 6 common magnetic tile defects were collected, and their pixel level ground-truth were labeled.

BACK to Table of Contents -->

11RSDDs: Rail Surface Defect Datasets

The RSDDs dataset contains two types of datasets: the first is a type I RSDDs dataset captured from the fast lane, which contains 67 challenging images. The second is a Type II RSDDs dataset captured from a normal/heavy transportation track, which contains 128 challenging images.

Each image of the two data sets contains at least one defect, and the background is complex and noisy.

These defects in the RSDDs dataset have been marked by professional human observers in the field of track surface inspection.

Official Linkhttp://icn.bjtu.edu.cn/Visint/resources/RSDDs.aspx
Download Linkhttps://pan.baidu.com/share/init?surl=svsnqL0r1kasVDNjppkEwg (passwordnanr)

BACK to Table of Contents -->

12Kylberg Texture Dataset v.1.0

Short description

28 texture classes, see Figure 4.
160 unique texture patches per class. (Alternative dataset with 12 rotations per each original patch, 160*12=1920 texture patches per class).
Texture patch size: 576x576 pixels.
File format: Lossless compressed 8 bit PNG.
All patches are normalized with a mean value of 127 and a standard deviation of 40.
One directory per texture class.
Files are named as follows: blanket1-d-p011-r180.png, where blanket1 is the class name, d original image sample number (possible values are a, b, c, or d), p011 is patch number 11, r180 patch rotated 180 degrees.

Offical Link: http://www.cb.uu.se/~gustaf/texture/

BACK to Table of Contents -->

13KTH-TIPS database

Repeat the background texture data set, the sample picture is as follows

Offical Link:https://www.nada.kth.se/cvap/databases/kth-tips/download.html
Download Link
- dataset1https://pan.baidu.com/s/173h8V66yRmtVo5rc2P7J4A
- dataset2https://pan.baidu.com/s/1dXFKn6v2PV5QS9m8gWlifA

BACK to Table of Contents -->

14Escalator Step Defect Dataset

Offical Linkhttps://aistudio.baidu.com/aistudio/datasetdetail/44820

BACK to Table of Contents -->

15Transmission Line Insulator Dataset

In the data set, Normal_Insulators contains 600 insulator images captured by drones. Defective_Insulators contains defective insulators, and the number of defective images of insulators is 248. The data set includes data sets and labels.

Offical Linkhttps://github.com/InsulatorData/InsulatorDataSet

BACK to Table of Contents -->

16MVTEC ITODD

The MVTec Industrial 3D Object Detection Dataset (MVTec ITODD) is a public dataset for 3D object detection and pose estimation with a strong focus on industrial settings and applications.

The dataset consists of

28 objects and 3500 labeled scenes containing instances of these objects
Five sensors (two 3D sensors and three grayscale cameras) observing each scene

More information can be found in this PDF file .

Download link https://www.mvtec.com/company/research/datasets/mvtec-itodd

BACK to Table of Contents -->

17BSData - dataset for Instance Segmentation and industrial Wear Forecasting

The dataset contains 1104 channel 3 images with 394 image-annotations for the surface damage type pitting. The annotations made with the annotation tool labelme, are available in JSON format and hence convertible to VOC and COCO format. All images come from two BSD types.

The other BSD type is shown on 325 images with two image-sizes. Since all images of this type have been taken with continuous time the degree of soiling is evolving.

Also, the dataset contains as above mentioned 27 pitting development sequences with every 69 images.

Offical link https://github.com/2Obe/BSData

Sincerely, thank @Beat Gartzia for his recommendation and all your attention!

BACK to Table of Contents -->

18The Gear Inspection Dataset

The Gear Inspection Dataset (GID) is a dataset for a competition held by Baidu (China) Co., called the "National Artificial Intelligence Innovation Application Competition." It has two thousand grayscale images with 28575 annotations for three types of defects from a real-world source. Each picture includes defects described in a separate JSON file with the image name, label categories, bounding boxes, and polygons for segmentation. Nevertheless, the tags for labeling categories do not include specific information about their type but only numbers, so spotting their similarities with other related datasets is challenging.

Offical link http://www.aiinnovation.com.cn/#/dataDetail?id=34

Download Link
- Gear Detection Training Dataset: https://pan.baidu.com/s/17HoFfBUQGeX7G0ibkPExrw (passwprd: hm7k)
- Gear detection A list evaluation dataset: https://pan.baidu.com/s/157Zf7hcTM78GhXtXI5ySFQ (pass: 2R6K)
- Gear detection B list evaluation dataset: https://pan.baidu.com/s/1OjOZotqlRSvsYLA_qH2nXA (pass: hypd)
Mirrors:
- Gear Detection Training Dataset: https://drive.google.com/file/d/1CZo-Ab5BXkTjV-b1-NIFzYMjfJQMl4nG/view?usp=share_link
- Gear detection A list evaluation dataset: https://drive.google.com/file/d/1-0sSrmhElBseeZWICu77lzTxoOiRD8yG/view?usp=share_link
- Gear detection B list evaluation dataset: N/A.

Note: The contest dataset is not for commercial use.

BACK to Table of Contents -->

19AeBAD Aircraft Engine Blade Anomaly Detection

Download link: http://suo.nz/2IU48P

The real-world aero-engine blade anomaly detection (AeBAD) data set consists of two sub-data sets: the single blade data set (AeBAD-S) and the blade video anomaly detection data set (AeBAD-V). Compared with existing datasets, AeBAD has the following two characteristics: 1.) The target samples are not aligned and at different scales. 2.) There is a domain shift in the distribution of normal samples in the test set and training set, where the domain shift is mainly caused by changes in illumination and view.

BACK to Table of Contents -->

20BeanTech Anomaly Detection(BTAD)

Download Linkhttp://suo.nz/2JEGEi

The BTAD (BeanTech Anomaly Detection) dataset is a real-world industrial anomaly dataset. This dataset contains a total of 2830 real-world images of 3 industrial products.

BACK to Table of Contents -->

3. More Inventory of the Best Data Set Sources

I have been collecting surface defect detection data sets, but there are still many data sets that have not been collected. For the data sets not collected in this repo, you can go to the following sites to view. At the same time, everyone is very welcome to share the new data set and become the contributor of this repo.

source	url	Recommended
Kaggle	https://www.kaggle.com/datasets
Paper With Code	https://paperwithcode.com/sota
Registry of Open Data on AWS	https://registry.opendata.aws
Microsoft Research Open Data	https://msropendata.com
Awesome-public-datasets	https://github.com/awesomedata/awesome-public-datasets

BACK to Table of Contents -->

4. Surface Defect Detection Papers

I have collected some articles on surface defect detection. The main objects to be tested are: defects or abnormal objects such as metal surfaces, LCD screens, buildings, and power lines. The methods are mainly classified method, detection method, reconstruction method and generation method. The electronic version (PDF) of the paper is placed under the file named corresponding to the date in the 'Paper' folder.

Go to 📂 [Papers].

BACK to Table of Contents -->

Acknowledgements

BACK to Table of Contents -->

Download

Download ZIP, click here
or run git clone https://github.com/Charmve/Surface-Defect-Detection.git in the terminal
Chinese Mainland - https://pan.baidu.com/s/122WY8F5VKqm3qMirqebRQw :i20n

BACK to Table of Contents -->

Notification

This work is originally contributed by lots of great man for their paper work or industry application. You can only use this dataset for research purpose.

If you have any questions or idea, please let me know 📧 [email protected]

Community

Github discussions or issues
QQ Group: 734758251 (password)
WeChat Group ID: Yida_Zhang2
Email: yidazhang1#gmail.com

Supporting

Support this project by becoming a sponsor. Your name and/or logo will show up our homepage with a link to your website.

Citation

Use this bibtex to cite this repository:

@misc{Surface Defect Detection,
  title={Surface Defect Detection: Dataset and Papers},
  author={Charmve},
  year={2020.09},
  publisher={Github},
  journal={GitHub repository},
  howpublished={\url{https://github.com/Charmve/Surface-Defect-Detection}},
}