Computer Vision AI With Yolo-v8

Redmen Ishab
5 min read · Jan 24, 2024
Computer Vision and object detection

Story of My life 🎵 :

My team (me and one other person 😁) was allocated full time (40 hrs/week) to a POC that required a machine to identify the shape, size and count of the jewelry items in a hand-drawn sketch. Let me tell you: when you come from a development background and have never been involved in any sort of AI or ML task (not even in college 😑, and wait for it, “never even worked in Python” 💥), and all of a sudden you are assigned full time, with a deadline, and expected to deliver a result that could bring in potential clients, it’s like being on stage, in the spotlight, with absolutely no idea why you are there, let alone what you should be doing. (I laugh about it now, but it felt very different then.)

AI engineers

Anyway, a part-time AI engineer (10 hrs/week 🙈), responsible for choosing the tools and validating the results, settled on the YOLO (You Only Look Once) model for detection and segmentation. So, with the deadline just a week away, we got our hands dirty: we downloaded the Jupyter notebook (never used one and didn’t know how it worked 😭) and started running the cells one by one (I discuss the approach we used in more detail later in this blog).

Technical Detail 🖥️ :

Background

YOLOv8 is, at the time of writing, the latest iteration in the ‘You Only Look Once’ series of models, offering improved accuracy and speed in object detection tasks. This post focuses on using it to detect and classify gem shapes in images, a task with real value in jewelry design. (The same steps should work for any detection task.)

Purpose and Objective

The primary goal is to accurately detect, classify, and count gems in a given image. Specifically, the model should be able to identify:

- Round-shaped gems
- Oval-shaped gems
- Pear-shaped gems

This will enable users to quickly assess the variety and quantity of gems in an image, which is particularly useful in inventory management and design planning.

Tools

1. YOLOv8: A deep learning model for object detection. This guide uses YOLOv8 to identify various gem shapes in images. (Download the Jupyter notebook)
2. Roboflow: A tool that helps organize and annotate image datasets, which is critical for training machine learning models like YOLOv8.

Steps

Steps involved in Object Detection

1. Acquire Images: Obtain high-quality images of gems, making sure every shape that needs to be detected is represented. Images should be clear, well lit, and have minimal background noise. The more images the model can be trained on, the better the output tends to be (but we only had 25 images 😭).

2. Data Preparation: Use Roboflow to annotate (label) the images, marking each gem shape with its class (I have pear, round and oval). This step is crucial for training the YOLOv8 model to recognize and differentiate between the shapes. Of all the steps, this is probably the easiest and also the most boring.
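
To make the annotation step a little more concrete: a detection dataset exported in YOLO format typically has one text file per image, with one line per labeled object in the form `class_id x_center y_center width height`, where the coordinates are normalized to the 0 to 1 range. The sketch below counts labels per class, which is a quick way to check how balanced a small dataset is. The class order and the sample label text are assumptions for illustration, not our real data.

```python
from collections import Counter

# Assumed class order; the real order comes from the `names` entry in data.yaml.
CLASS_NAMES = ["oval", "pear", "round"]


def count_labels(label_text, class_names=CLASS_NAMES):
    """Count annotated objects per class in one YOLO-format label file."""
    counts = Counter()
    for line in label_text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 5:
            raise ValueError(f"expected 5 fields per line, got: {line!r}")
        class_id = int(parts[0])
        coords = [float(p) for p in parts[1:]]
        if not all(0.0 <= c <= 1.0 for c in coords):
            raise ValueError(f"coordinates must be normalized: {line!r}")
        counts[class_names[class_id]] += 1
    return counts


# Hypothetical label file for one sketch: two round gems and one pear.
sample = """\
2 0.25 0.30 0.10 0.10
2 0.60 0.30 0.12 0.12
1 0.45 0.70 0.08 0.15
"""
print(count_labels(sample))
```

With only a handful of images, a check like this makes it obvious when one shape is badly underrepresented.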

3. Model Training: Train the YOLOv8 model on the annotated dataset. This involves setting the training parameters, choosing a model configuration, and running the training process. (Surprisingly, the path from the yaml file was throwing an error for us because it was a relative path 🤷)

!yolo task=detect mode=train model=yolov8s.pt data=[PATH_TO_DATASET]/data.yaml epochs=30 imgsz=640 plots=True

Find more on the parameters in the documentation.
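
The same training run can also be started from Python in a notebook cell. This is a minimal sketch, assuming the `ultralytics` package is installed; the folder name `datasets/gems` and the helper functions are hypothetical. It passes an absolute path to data.yaml, which may help with relative-path errors (the paths inside data.yaml might still need adjusting in your setup).

```python
from pathlib import Path


def absolute_data_yaml(dataset_dir):
    """Return the absolute path to data.yaml inside a dataset folder."""
    return (Path(dataset_dir).expanduser().resolve() / "data.yaml").as_posix()


def train_args(dataset_dir, epochs=30, imgsz=640):
    """Build the same settings the CLI command uses, as keyword arguments."""
    return {
        "data": absolute_data_yaml(dataset_dir),
        "epochs": epochs,
        "imgsz": imgsz,
        "plots": True,
    }


def train(dataset_dir):
    # Imported lazily so the helpers above work without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO("yolov8s.pt")  # pretrained small checkpoint, as in the command
    return model.train(**train_args(dataset_dir))


# "datasets/gems" is a hypothetical folder name.
print(train_args("datasets/gems"))
```

Calling `train("datasets/gems")` would then start the actual run; it is kept separate here so the settings can be inspected first.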

4. Model Testing and Validation: After training, test the model’s accuracy on a separate set of images that were not used in training. This step is crucial for evaluating how the model performs in real-world scenarios. Check the confusion matrix (an ideal matrix has most of its counts, or heat, on the top-left to bottom-right diagonal) and results.png. (Find more on performance metrics)

Confusion matrix
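
To show what "heat on the diagonal" means in numbers, here is a small sketch that computes the share of counts on the diagonal, plus precision and recall per class. The counts are made up for illustration, and the row/column convention is an assumption stated in the code; the plot produced by training may also include a background class, which this sketch leaves out.

```python
def per_class_metrics(matrix, class_names):
    """Compute precision and recall per class from a square confusion matrix.

    Convention assumed here: matrix[i][j] counts objects whose true class
    is j and whose predicted class is i.
    """
    metrics = {}
    for k, name in enumerate(class_names):
        true_positive = matrix[k][k]
        predicted_as_k = sum(matrix[k])             # row total
        actually_k = sum(row[k] for row in matrix)  # column total
        precision = true_positive / predicted_as_k if predicted_as_k else 0.0
        recall = true_positive / actually_k if actually_k else 0.0
        metrics[name] = {"precision": precision, "recall": recall}
    return metrics


def diagonal_share(matrix):
    """Fraction of all counts on the diagonal (1.0 means no confusion)."""
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(len(matrix)))
    return diagonal / total if total else 0.0


# Made-up counts for illustration only, not our real results.
names = ["oval", "pear", "round"]
matrix = [
    [8, 1, 1],  # predicted oval
    [0, 9, 0],  # predicted pear
    [2, 0, 9],  # predicted round
]
print(round(diagonal_share(matrix), 3))
for name, m in per_class_metrics(matrix, names).items():
    print(name, round(m["precision"], 2), round(m["recall"], 2))
```

Off-diagonal cells tell you which shapes get mixed up; in this made-up example, ovals are sometimes predicted as round.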

5. Inference: Once validated, use the trained model to detect items in new images. The command and its output are as follows:

!yolo task=detect mode=predict model={HOME}/runs/detect/train/weights/best.pt conf=0.25 source={dataset.location}/test/
Inferred result
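
Since the goal is to count gems per shape, the predictions still need to be tallied. Below is a minimal sketch, assuming the `ultralytics` package is installed; `count_gems` and `count_by_class` are hypothetical helper names, and the class ids in the example are invented. It reads the predicted class id of each detected box and counts them by class name.

```python
from collections import Counter


def count_by_class(class_ids, names):
    """Turn a list of predicted class ids into {class name: count}."""
    return dict(Counter(names[int(i)] for i in class_ids))


def count_gems(weights_path, image_path, conf=0.25):
    # Imported lazily so count_by_class can be used without ultralytics.
    from ultralytics import YOLO

    model = YOLO(weights_path)
    # One image in, one result out.
    result = model.predict(source=image_path, conf=conf)[0]
    return count_by_class(result.boxes.cls.tolist(), result.names)


# Hypothetical class ids, as they might come back for one sketch.
names = {0: "oval", 1: "pear", 2: "round"}
print(count_by_class([2.0, 2.0, 1.0, 0.0, 2.0], names))
```

Raising `conf` drops low-confidence boxes, so the counts depend on the threshold you choose.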

Thus, if you have followed each step using your own resources for annotation, training and inference (due to privacy policy I couldn’t share ours 😔), you have completed all the steps of object detection. You can now deploy the model, create an API for message passing and integrate it into your application (if I receive requests, I’d happily write another blog on that too).

Advanced Configuration and Optimization

After the initial training, you might need to fine-tune the model for optimal performance. This involves:

  • Hyperparameter Tuning: Adjusting the learning rate, batch size, and other parameters for better accuracy.
  • Data Augmentation: Introducing variations in the dataset, like rotations or flips, to make the model more robust.
  • Data Preprocessing: Applying transformations to the images, like grayscale conversion or color adjustments, to make the model more effective.
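
As a rough illustration of the tuning and augmentation options above, the sketch below builds a training command with a few extra overrides. The helper `build_train_command` is hypothetical and the values are only starting points to experiment with, not settings we validated; `lr0`, `batch`, `degrees` and `fliplr` are training arguments described in the Ultralytics documentation.

```python
def build_train_command(data_yaml, **overrides):
    """Render a `yolo` training command with extra key=value overrides."""
    args = {
        "task": "detect",
        "mode": "train",
        "model": "yolov8s.pt",
        "data": data_yaml,
        "epochs": 30,
        "imgsz": 640,
    }
    args.update(overrides)
    return "yolo " + " ".join(f"{key}={value}" for key, value in args.items())


# Illustrative values only.
command = build_train_command(
    "[PATH_TO_DATASET]/data.yaml",
    lr0=0.005,   # initial learning rate
    batch=8,     # batch size
    degrees=15,  # random rotation range, in degrees
    fliplr=0.5,  # probability of a horizontal flip
)
print(command)
```

In a notebook, the printed command can be run by prefixing it with `!`, the same way as the training command earlier.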

Troubleshooting

Common issues and their resolutions:

  1. Low Detection Accuracy: If the model isn’t accurately identifying gem shapes, consider increasing the dataset size or improving image quality. We also noticed that the number of epochs significantly affects the model’s effectiveness (we currently use 50 epochs).
  2. Overfitting: If the model performs well on training data but poorly on new images, try reducing the complexity of the model or increasing the diversity of the training dataset.
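
One way to spot overfitting is to look for the epoch where validation loss stops improving while training loss keeps falling. The sketch below does that for a CSV of per-epoch losses. It assumes a layout like the results.csv that training writes next to results.png, with an `epoch` column and a `val/box_loss` column; those column names and the sample numbers are assumptions for illustration.

```python
import csv
import io


def best_validation_epoch(csv_text, column="val/box_loss"):
    """Return (epoch, loss) for the row with the lowest validation loss."""
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    best = None
    for row in reader:
        # Strip padding, since column names and values may be space-padded.
        row = {key.strip(): value.strip() for key, value in row.items()}
        epoch, loss = int(row["epoch"]), float(row[column])
        if best is None or loss < best[1]:
            best = (epoch, loss)
    return best


# Made-up numbers: training loss keeps falling, validation loss turns up.
sample = """\
epoch,train/box_loss,val/box_loss
1,1.90,1.95
2,1.40,1.60
3,1.00,1.45
4,0.70,1.55
5,0.50,1.80
"""
print(best_validation_epoch(sample))
```

If the best epoch comes well before the last one, training longer is unlikely to help, and more varied data is the better investment.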

#computerVision #yolov8 #detection

Redmen Ishab

Software Engineer. Failed startup “Software Factory”. Working experience as CTO, SSE, Full Stack Dev, Mobile App Engineer, and Lecturer.