Tool and Service
Using Microsoft Custom Vision AI for training and testing
Type of Custom Vision AI: Object Detection with general category
What does Custom Vision Service do well?
The Custom Vision Service works best when the item you're trying to classify is prominent in your image.
To start training a model, Custom Vision requires at least 15 images per class; about 50 images per class are enough for a first prototype. In this project, we created 4 classes with around 100 images per class.
Custom Vision Service accepts training images in .jpg, .png, and .bmp format, up to 6 MB per image. (Prediction images can be up to 4 MB per image.) We recommend that images be 256 pixels on the shortest edge. Any images shorter than 256 pixels on the shortest edge are scaled up by Custom Vision Service.
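The limits above can be checked locally before uploading. The following is a minimal sketch, not part of the Custom Vision SDK: the function name `check_image` is hypothetical, and we assume that 1 MB means 1024 * 1024 bytes and that `.jpeg` is treated the same as `.jpg`. It does not check the 256-pixel recommendation, which would need an image library.

```python
import os

# Formats and size limits quoted in the text above.
# Assumption: ".jpeg" is accepted as an alias of ".jpg".
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}
# Assumption: 1 MB = 1024 * 1024 bytes.
MAX_TRAINING_BYTES = 6 * 1024 * 1024    # training images: up to 6 MB
MAX_PREDICTION_BYTES = 4 * 1024 * 1024  # prediction images: up to 4 MB


def check_image(filename, size_bytes, for_prediction=False):
    """Return a list of problems for one image (empty list if none)."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    limit = MAX_PREDICTION_BYTES if for_prediction else MAX_TRAINING_BYTES
    if size_bytes > limit:
        problems.append(f"file is {size_bytes} bytes, limit is {limit}")
    return problems
```

For a file on disk, the size can be passed in with `os.path.getsize(path)`.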
A single image can contain objects from multiple classes.
Train datasets
For the demo, we use 4 products for training and testing:
Class | Train image quantity | Test image quantity | Split ratio | Type |
Bia 333 | 111 | 22 | 80/20 | JPEG |
Bia Heineken | 108 | 22 | 80/20 | JPEG |
Dau an Neptune | 104 | 21 | 80/20 | JPEG |
Dau an Simply | 106 | 21 | 80/20 | JPEG |
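The split ratio in the table is nominal. As a small sketch, the actual train/test share per class can be computed from the image counts; the counts are copied from the table, and the helper name `train_share` is ours.

```python
# Counts copied from the table above: class -> (train images, test images).
DATASET = {
    "Bia 333": (111, 22),
    "Bia Heineken": (108, 22),
    "Dau an Neptune": (104, 21),
    "Dau an Simply": (106, 21),
}


def train_share(train_count, test_count):
    """Percentage of a class's images that are used for training."""
    return 100.0 * train_count / (train_count + test_count)


for name, (train_count, test_count) in DATASET.items():
    share = train_share(train_count, test_count)
    print(f"{name}: {share:.1f}% train / {100 - share:.1f}% test")
```

With these counts, the training share is roughly 83% for every class, which is close to the nominal 80/20 split.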
The input parameters
Threshold type | Threshold value | Meaning of the threshold |
Probability Threshold | 50% | The minimum predicted probability for a prediction to be counted when Precision and Recall are calculated. We use 50% because it gives a balanced result between Precision and Recall. |
Overlap Threshold | 30% | The minimum overlap between the region suggested by Custom Vision and the region drawn by the user for a prediction to be counted as correct. If we increase this value, fewer predictions are counted as correct, so the Precision and Recall values decrease. |
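To make the effect of the two thresholds concrete, here is a minimal sketch of how Precision and Recall could be computed for one image. It is not Custom Vision's internal implementation: we assume that overlap is measured as intersection over union (IoU), that boxes are given as (left, top, width, height), and that each user-drawn box can be matched at most once. The function names are ours.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (left, top, width, height)."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


def precision_recall(predictions, ground_truth,
                     prob_threshold=0.5, overlap_threshold=0.3):
    """predictions: list of (tag, probability, box).
    ground_truth: list of (tag, box) drawn by the user."""
    # Probability threshold: drop low-confidence predictions first.
    kept = [p for p in predictions if p[1] >= prob_threshold]
    matched = set()
    true_positives = 0
    # Match the most confident predictions first.
    for tag, _prob, box in sorted(kept, key=lambda p: p[1], reverse=True):
        for index, (gt_tag, gt_box) in enumerate(ground_truth):
            if index in matched or gt_tag != tag:
                continue
            # Overlap threshold: the boxes must overlap enough to count.
            if iou(box, gt_box) >= overlap_threshold:
                matched.add(index)
                true_positives += 1
                break
    precision = true_positives / len(kept) if kept else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```

In this sketch, a prediction whose box overlaps the drawn box with an IoU of about 0.33 counts as correct at the 30% overlap threshold but not at 50%, which is why raising the overlap threshold lowers both values.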
Duration of training: 6 minutes
To validate the result, Microsoft Custom Vision AI uses a process called k-fold cross-validation.
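As an illustration of the general idea only, the sketch below splits a list of images into k folds so that every image is used for validation exactly once. The value of k and the splitting procedure that Custom Vision actually uses are not stated here, so `k=5`, the fixed seed, and the function name are assumptions.

```python
import random


def k_fold_splits(items, k=5, seed=0):
    """Yield (train, validation) lists; each item is validated exactly once."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed for repeatable folds
    # Deal the shuffled items into k folds of nearly equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [item for j, fold in enumerate(folds) if j != i for item in fold]
        yield train, validation


# Example with 10 hypothetical image names and 5 folds.
images = [f"image_{n:02d}.jpg" for n in range(10)]
for train, validation in k_fold_splits(images, k=5):
    print(len(train), "train /", len(validation), "validation")
```

The model is trained k times, each time on the training part and scored on the validation part, and the scores are then averaged.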
Result of training
The result after training with the above datasets
Test and retrain a model
Test an image
From the test result, we can correct the predicted result and add the image to the dataset for the next training round, which makes the model more accurate.
How to improve the train dataset
- First-round training
- Add more images and balance data
- Retrain
- Add images with varying background, lighting, object size, camera angle, and style
- Retrain and feed in images for prediction
- Examine prediction results
- Modify existing training data
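For the "balance data" step, a simple check is to count how many images each class needs to reach the size of the largest class. This is only a sketch under that assumption about what "balanced" means; the function name is ours, and the counts are the training counts from the dataset table.

```python
def images_to_add(train_counts):
    """Images each class needs to match the largest class."""
    target = max(train_counts.values())
    return {name: target - count for name, count in train_counts.items()}


# Training counts from the dataset table.
train_counts = {
    "Bia 333": 111,
    "Bia Heineken": 108,
    "Dau an Neptune": 104,
    "Dau an Simply": 106,
}
for name, missing in images_to_add(train_counts).items():
    print(f"{name}: add {missing} images")
```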
Train Dataset Suggestion
Based on the type of object, we should choose images in which the object is easy to recognize. With the above datasets, we suggest focusing on the product logo, which can be used to distinguish between similar products.
End-to-End solution
This is a proposed general solution built on the Custom Vision Service.
Demo
Reference