Computer Vision

Preparing Data for YOLO Training: Data Annotation Techniques and Best Practices

In this post, we'll guide you through the process of preparing annotated data for YOLO model training, from labeling to organizing your data

Admon W.

You now have a basic understanding of the YOLO object detection algorithm and might be eager to try it out in your own projects.

The key to success is a customized training dataset.

Tailored datasets are crucial for developing high-precision, efficient YOLO models that cater to your specific use case. By annotating your own data, you ensure that the model learns to recognize objects relevant to your domain, whether it's detecting vehicles on the road, identifying products on a conveyor belt, or spotting safety hazards at a construction site.

In this post, we'll guide you through the process of preparing annotated data for training your YOLO model, from labeling objects in images to organizing your dataset.

Data Preparation for YOLO v9 Training

Remember, a well-prepared annotated dataset not only enhances your model's performance but also reduces the time and resources needed for training. The data preparation process can be divided into four steps:

  • Data Collection: Gather a large, diverse dataset of images that represent all the classes you want your model to detect. You can use public datasets like COCO and Pascal VOC, or collect your own custom data.

  • Data Annotation: Each image needs annotations in YOLO format that give the class and location (usually a bounding box) of every object. Annotation accuracy directly impacts model performance.

  • Annotation Format Conversion: YOLO requires annotations in a specific format. Each image has a matching .txt file with one line per object, giving its class and bounding box:

<object-class> <x_center> <y_center> <width> <height>

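<object-class> is the zero-based integer class index. The four box values are normalized to the range 0 to 1: x_center and width are divided by the image width, y_center and height by the image height.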

  • Dataset Splitting: Split the dataset into train, validation, and test sets. Held-out data is what lets you detect overfitting and measure how the model performs on images it has not seen. A typical split is 70% training, 15% validation, and 15% test.
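To make the label format concrete, here is a minimal sketch of the conversion from a pixel-space box to a YOLO label line. The function name to_yolo_line and the example box and image size are hypothetical, chosen only for illustration; an annotation tool's YOLO export normally does this for you.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a corner-style pixel box into one YOLO label line.

    Hypothetical helper: assumes (x_min, y_min) is the top-left corner and
    (x_max, y_max) the bottom-right corner, both in pixels.
    """
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image dimensions must be positive")
    # Box centre and size, each divided by the matching image dimension.
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Made-up example: a 200x200 px box in a 640x480 px image, class index 0.
line = to_yolo_line(0, 100, 200, 300, 400, 640, 480)
print(line)  # 0 0.312500 0.625000 0.312500 0.416667
```

All four box values land between 0 and 1, so the same label stays valid if the image is resized.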

Data Annotation for YOLO v9

Now, let's walk through the data annotation process to prepare a dataset for YOLO training.

First, choose an annotation tool. Both open-source and cloud-based tools can work, but online versions tend to be more efficient for teams.

As an example, we'll use BasicAI Cloud*, a popular choice for object detection research.

No installation is needed; just sign up for a free account at https://app.basic.ai.*

A dataset for turtle detection

We've collected a dataset for turtle detection. Without annotations, the model can't learn, so let's start annotating.

Uploading Data

On the BasicAI Cloud* UI, go to "Datasets," click "+Create," select the "Image" type, name your dataset, and click "Create."

Step 1: Uploading Data

In the preview interface, click the blue "+Upload" button. You can upload from local files, URLs, or cloud storage. Here, we upload from local files.

Step 1: Data Uploaded

Creating an Ontology

Let's create a "Turtle" ontology class.

Go to the "Ontology" tab and click "+Create". Choose the Bounding Box type, name it, and set the box color.

Step 2: Creating an Ontology

Annotating Data

Back on the "Data" tab, select all data and click "Annotate".

Step 3: Annotating Data

Annotation tools are on the left, and classes are on the right.

Select the "Bounding Box Tool" (shortcut '1'). The cursor becomes a crosshair.

💡 Tip: Pre-select the class so it is assigned to new boxes automatically. This is especially useful when an image contains many objects.

Click one corner of the object, then the opposite corner, to create a box. Use the arrow tool to adjust the edges.

💡 Tip: Enable "Measure Line" in "Display Setting" to show guidelines while you draw.

Annotate objects in all images using this method. Click "Save" when done and exit.

"Preview Annotations" shows the results.

Step 3: Data Annotated

Exporting Data

Click "Export" to create an export task.

Step 4: Exporting Data

Under "Annotation Format", choose the TXT format for YOLO. Click "Create".

Download the results when ready.

Step 4: Results Exported

Each file contains the info needed for training. Here, the system auto-assigned "0" to the single label.
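Before training, it can be worth sanity-checking the exported files. The sketch below parses the text of one label file and checks that every line has five fields and that the box values are normalized. The function name parse_yolo_labels and the sample contents are hypothetical; the sample is not real export output.

```python
def parse_yolo_labels(text):
    """Parse the text of one YOLO label file into a list of tuples.

    Hypothetical helper: returns (class_id, x_center, y_center, width, height)
    per object and raises ValueError on malformed lines.
    """
    boxes = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 5:
            raise ValueError(f"line {line_no}: expected 5 fields, got {len(parts)}")
        class_id = int(parts[0])
        values = [float(p) for p in parts[1:]]
        # Normalized coordinates must lie between 0 and 1.
        if any(not 0.0 <= v <= 1.0 for v in values):
            raise ValueError(f"line {line_no}: box values must be within [0, 1]")
        boxes.append((class_id, *values))
    return boxes


# Made-up file contents: two boxes, both with class index 0 ("Turtle").
sample = "0 0.5 0.5 0.25 0.4\n0 0.1 0.2 0.05 0.1\n"
boxes = parse_yolo_labels(sample)
print(len(boxes), "boxes, classes:", sorted({b[0] for b in boxes}))
```

With a single-class ontology like the one here, every line should start with 0.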

Project Structure

Organize the project the same way you would for YOLO v7; the structure v9 expects is very similar.

Step 5: Project Structure
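The 70/15/15 split mentioned earlier and the folder layout can be planned together. The sketch below shuffles image names reproducibly, splits them, and lists where each image and label file would go. The images/<split> and labels/<split> folder names, the helper names, and the turtle_000 style file names are assumptions for illustration; check them against the layout shown above and against your training config before copying files.

```python
import random


def split_dataset(stems, train=0.7, val=0.15, seed=42):
    """Shuffle file stems reproducibly and split into train/val/test."""
    stems = sorted(stems)  # sort first so the result depends only on the seed
    random.Random(seed).shuffle(stems)
    n_train = round(len(stems) * train)
    n_val = round(len(stems) * val)
    return {
        "train": stems[:n_train],
        "val": stems[n_train:n_train + n_val],
        "test": stems[n_train + n_val:],  # whatever remains
    }


def planned_paths(splits, image_ext=".jpg"):
    """List (image_path, label_path) pairs for an assumed folder layout."""
    plan = []
    for split, names in splits.items():
        for stem in names:
            # Assumed layout: images/<split>/ and labels/<split>/ side by side.
            plan.append((f"images/{split}/{stem}{image_ext}",
                         f"labels/{split}/{stem}.txt"))
    return plan


# Hypothetical file stems standing in for the turtle images.
stems = [f"turtle_{i:03d}" for i in range(20)]
splits = split_dataset(stems)
plan = planned_paths(splits)
print({name: len(items) for name, items in splits.items()})
# {'train': 14, 'val': 3, 'test': 3}
```

Keeping each label file under the same stem as its image is what lets the two be matched up later, so move or copy them as pairs.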

Why Choose BasicAI Cloud* for YOLO Data Annotation

BasicAI Cloud* is an all-in-one smart data annotation solution that seamlessly integrates with your YOLO workflow, making the annotation process efficient and collaborative.

  • Comprehensive Features: BasicAI Cloud* supports all data types, including images, video, LiDAR fusion, audio, and text. Model-assisted tools enable auto pre-annotation (instance annotation, semantic segmentation, speech recognition) and interactive annotation.

  • Built for Team Collaboration: Scalable project management integrates external teams and models into custom workflows. Annotation tasks can be bulk-assigned quickly. Custom real-time QA catches quality issues fast. Detailed performance reports are provided.

  • Dataset Management: Upload pre-annotated data for fine-tuning. Video frame extraction and continuous frame split/merge. Cloud storage integration.

  • Cost: Free accounts have nearly full functionality: 5 seats, 200GB storage, and 10,000 free auto-labels, which suits small research teams. Pricing for larger teams is competitive, and enterprise on-prem deployment is available.

By leveraging BasicAI Cloud* for your YOLO data labeling needs, you can streamline the process of preparing high-quality annotated data, collaborate effectively with your team, and manage your dataset with ease. This powerful platform empowers you to focus on developing accurate and efficient YOLO object detection models while minimizing the time and effort spent on data annotation.


* To further enhance data security, we discontinued the Cloud version of our data annotation platform on 31st October 2024. Please contact us for a customized private deployment plan that meets your data annotation goals while prioritizing data security.


Disclaimer

The turtle images used in this article were obtained through Bing Image search and are used solely for educational, non-commercial purposes, such as learning exchange and experience sharing. We do not claim ownership of these images and do not intend to use them for any commercial gain. If you are the rightful owner of any of the images used and believe that their use in this article constitutes copyright infringement, please contact us immediately. Upon receiving a valid request, we will promptly remove the article and the infringing content. We respect the intellectual property rights of others and strive to comply with all applicable copyright laws. If you have any questions or concerns regarding the use of these images, please reach out to us, and we will address the matter accordingly.


