In this blog, we’ll give you a high-level overview of training a YOLOv8 model using Falcon, our digital twin simulation platform. We’ll cover how to create scenarios, generate data from digital twins, and train and test your model, and we’ll introduce best practices for getting high-performance results.
This process is also covered in the video (below) by Duality's Community Manager, Rebekah Bogdanoff, making it easy to follow along with the entire exercise.
Note: This blog offers a glimpse into the content of Falcon EDU Exercises 1 and 2. Anyone can try these exercises for themselves and access the more detailed documentation that walks through the entire process. Simply sign up for the free EDU tier of Falcon, and start today: Create Account - FalconCloud by Duality AI.
For these exercises, we’ll use YOLOv8 (You Only Look Once), a powerful model optimized for lightweight hardware and high-speed performance. While it may not be as advanced as some newer models, its excellent compute-cost-to-performance ratio makes it versatile across various setups. Pre-trained YOLO models save time and boost accuracy, making them a great starting point. While here we focus on Object Detection, YOLOv8 also supports tasks like pose estimation and instance segmentation, which is useful for broader AI explorations.
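As a quick, illustrative taste (not part of the exercise scripts themselves), loading a pre-trained YOLOv8 model with the ultralytics Python package and running it on an image takes only a few lines; the yolov8n.pt checkpoint and sample.jpg file below are placeholders.

```python
# Illustrative example: load a pre-trained YOLOv8 model with the ultralytics package.
# "yolov8n.pt" is the smallest (nano) checkpoint; larger variants trade speed for accuracy.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # downloads the pre-trained weights on first use

# Run object detection on a placeholder image and print the detected classes.
results = model("sample.jpg")
for box in results[0].boxes:
    print(model.names[int(box.cls)], float(box.conf))
```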
Object detection is a foundational capability for a wide variety of perception modules common on today's robots and AI-powered systems, enabling vital tasks that include:
While object detection is a well-established task, it is constantly being improved upon by new models and new techniques. This exercise lets us introduce a powerful and widely used model to individuals who may be newer to the field, and to teach a wide array of vital concepts in an easily accessible format.
Synthetic data is generated from simulated real-world environments to create labeled datasets for training AI models. Its benefits include:
For these exercises, we trained YOLOv8 to detect two objects, a cereal box and a soup can, in an indoor setting. Using Falcon, we replicated real-world conditions while introducing randomness for more robust results. This includes object pose variations, visual occlusions, and diverse lighting conditions.
For generating synthetic data, we use FalconEditor, our integrated development environment that lets us create any scenario of interest, deploy virtual sensors, run simulations, and generate data, all in one place.
Our documentation (1. Binary Install - Documentation - FalconCloud by Duality AI) walks you through the installation and configuration of FalconEditor. Just create a free EDU account to get started.
The scenarios contain 4 main components:
Note: All of the digital twins and assets needed to run this scenario are provided for free in FalconCloud, which you can access with the EDU account.
While we could place the main object anywhere in the room and, similarly, place the camera in any desired location, our goal is to create an intentional, information-rich dataset that is indistinguishable from one that could exist in the real world (read more about how to make this type of robust dataset here). To achieve this, we need to consider some parameters.
A robust dataset includes:
Fully controlling simulation parameters is fast in Falcon. While Exercises 1 and 2 provide fully pre-configured parameters, the instructions still walk through key variables and how to adjust them. These include:
It's a good idea to match the parameters of the simulation camera to those of the real camera used to take the testing images.
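As a purely hypothetical illustration of that alignment (the field names below are not Falcon's actual configuration keys), the idea is simply to note the real test camera's specs and mirror them in the simulated sensor:

```python
# Hypothetical illustration only: these field names are not Falcon's actual API.
# The point is to copy the real test camera's specs into the simulated sensor so
# synthetic and real images share the same framing and optics.
real_camera = {
    "resolution": (1920, 1080),   # pixels, as used for the real test photos
    "horizontal_fov_deg": 69.0,   # field of view of the real lens
}

sim_camera = dict(real_camera)    # keep the simulated sensor aligned with reality
print(sim_camera)
```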
Post-processing covers all alterations made to an image after it has been generated in the scenario. These alterations are designed to mimic the artifacts found in photographs taken by real cameras with real optics. They include:
These parameters define the starting conditions of the digital twins in the scenario – in this case the cereal box or soup can. For Exercises 1 and 2, main parameters include:
Falcon can be customized with various modules designed for different synthetic data workflows. Here we’re using FalconVision, a module specifically designed to streamline data generation for training vision models. Once the scenario is set up, running it is as easy as pressing “Play”.
The scenario's Python script guides both the positioning of objects and the image-capture process. The script first instructs Falcon to drop each clutter twin at a random location within the designated volume, then drop the chosen object twin, and finally position the camera and capture an image. It repeats this process, capturing the twin at different locations, orientations, and camera angles within the environment until it has produced a full dataset. As the images are generated, they are automatically saved into output folders that will later be used to train the AI model. The captured images, along with the simultaneously generated YOLO annotations, form our synthetic dataset.
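Here is a minimal, self-contained sketch of that randomize-drop-capture loop. Every falcon_* function below is a hypothetical stub standing in for the actual FalconVision calls; the real scenario script in Exercises 1 and 2 handles all of this for you.

```python
import random

# Hypothetical stubs: in the real scenario these map to FalconVision calls.
def falcon_drop_at_random_pose(twin_name: str) -> None:
    """Drop a digital twin at a random pose inside the designated volume."""

def falcon_move_camera(distance_m: float, azimuth_deg: float, elevation_deg: float) -> None:
    """Reposition the virtual camera around the object."""

def falcon_capture_image_and_labels(frame_id: int) -> None:
    """Render an image and write its matching YOLO annotation."""

CLUTTER_TWINS = ["mug", "book", "bowl"]  # example clutter; the exercise supplies its own twins
TARGET_TWIN = "cereal_box"
NUM_FRAMES = 500                         # size of the synthetic dataset

for frame_id in range(NUM_FRAMES):
    for clutter in CLUTTER_TWINS:            # 1. scatter the clutter twins
        falcon_drop_at_random_pose(clutter)
    falcon_drop_at_random_pose(TARGET_TWIN)  # 2. drop the chosen object twin
    falcon_move_camera(                      # 3. pick a new viewpoint
        distance_m=random.uniform(0.5, 2.0),
        azimuth_deg=random.uniform(0.0, 360.0),
        elevation_deg=random.uniform(10.0, 60.0),
    )
    falcon_capture_image_and_labels(frame_id)  # 4. render the image and its annotation
```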
This process takes about 30-40 minutes.
Note: How do we know the annotations are accurate? With digital twins, all of the ground truth information is contained within the scenario, and Falcon knows exactly which pixel belongs to what object, making annotation labels 100% accurate.
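For reference, a YOLO annotation is a plain text file with one line per object: a class index followed by the bounding box center, width, and height, all normalized to the image dimensions. A label file for an image containing one cereal box (class 0) and one soup can (class 1) might look like this (values are illustrative):

```
0 0.512 0.430 0.210 0.380
1 0.268 0.655 0.140 0.220
```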
The generated images and annotations are automatically separated into two sets: training data and validation data. The model uses the synthetic training data to adjust its parameters and “learn” to detect the chosen object. It then uses the separate synthetic validation data to evaluate how well it detects the object in images it was not trained on.
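In YOLO's convention, that split is described by a small dataset YAML file that points at the training and validation image folders and lists the class names. A minimal example for this two-class setup might look like the following (file name and paths are illustrative):

```yaml
# synthetic_data.yaml (illustrative) - the dataset config consumed by YOLOv8
train: datasets/synthetic/images/train   # synthetic training images
val: datasets/synthetic/images/val       # synthetic validation images
nc: 2                                    # number of classes
names: ["cereal_box", "soup_can"]        # class names used in the annotations
```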
Training can be carried out on your local machine, but learners or casual users can use Google Colab, a free hosted notebook service, to train and test their model. This avoids potential installation conflicts or memory concerns. With Google Colab, you can:
Anyone with a Google account already has access to Colab, and you can learn more about it here: https://colab.research.google.com/. We also have instructions in our documentation.
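If you go the Colab route, the setup typically boils down to a couple of notebook cells; the example below is generic rather than the exercise's exact notebook.

```python
# Typical first cells of a Colab training notebook (generic example).
!pip install ultralytics            # installs YOLOv8 and its dependencies

# Optional: mount Google Drive if your synthetic dataset is stored there.
from google.colab import drive
drive.mount('/content/drive')
```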
Exercise 1 and Exercise 2 automatically provide training and testing scripts that you can simply run, either on your machine or on Google Colab. You can adjust some parameters, such as the location of the testing data, the location of the training data, or the number of epochs.
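Under the hood, a YOLOv8 training run like the one in those scripts comes down to a few lines of the ultralytics API; the dataset config name and epoch count below are placeholders you would adjust to point at your own synthetic data.

```python
from ultralytics import YOLO

# Start from pre-trained weights and fine-tune on the synthetic dataset.
# "synthetic_data.yaml" is a placeholder for the dataset config described above.
model = YOLO("yolov8n.pt")
model.train(
    data="synthetic_data.yaml",  # locations of the training and validation data
    epochs=100,                  # one of the parameters you can adjust
    imgsz=640,                   # training image size
)
```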
As the model trains, the script outputs epoch progress, loss values, and mAP50 metrics for each epoch. When training finishes, it plots these metrics on a graph that the user can analyze for potential training issues. The exercises outline some of the more common problems, such as overfitting, underfitting, or diverging loss. See our documentation for more.
While this process does output an mAP50 score, it’s important to note that at this point the score is based only on synthetic data. To know whether the model truly works in the physical world, we have to test it using real-world images.
As with the training above, the testing scripts for Exercises 1 and 2 are already set up so that the user just has to run the testing script, and it will test the model using annotated real-world images that we provide. The extension for Exercise 2 outlines how to take and annotate your own real-world images, so that more advanced users can create their own testing set and tweak their simulation to provide aligned synthetic images.
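As a rough sketch of what that real-world test amounts to (the provided script handles this for you), you can point the trained weights at a dataset config describing the annotated real-world images, or simply run prediction on a folder of photos; the paths and file names below are illustrative.

```python
from ultralytics import YOLO

# Load the weights produced by training; ultralytics saves the best checkpoint
# under runs/detect/train*/weights/ by default (path shown here is illustrative).
model = YOLO("runs/detect/train/weights/best.pt")

# Evaluate against annotated real-world images described by a separate dataset config.
metrics = model.val(data="real_world_test.yaml")
print(f"mAP50 on real-world images: {metrics.box.map50:.3f}")

# Or just visualize predictions on a folder of unannotated photos.
model.predict(source="real_world_photos/", save=True)
```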
Once tested, the script outputs the following:
After the first round of training, you may want to push your mAP50 even higher; for example, manufacturers might want an mAP50 of 0.99 or greater. A strong advantage of synthetic training data over real-world training data is that we can easily go back and generate new training data, tweaking parameters and extending the dataset to give the model the information it needs to learn and perform well, resulting in a more robust model.
So what kind of changes can we introduce in our synthetic data to improve model performance? And how does simulation make this a breeze?
We don’t necessarily need to adjust ALL of these parameters for successful training. This is another area where synthetic data makes it easy to quickly try out variations to find what produces better results — a much more difficult task with real-world data.
Exercises 1 and 2 are designed to equip users with key AI training knowledge, baseline simulation skills, and the basic scripts and functions needed to begin creating their own projects. Each exercise ends with a challenge for users to either continue improving the model or edit the simulation for their own novel twins.
For learners of all levels, Duality regularly offers live courses that teach more niche and intensive skills, including digital twin creation, blueprint breakdowns, simulation setup, and more. All of these resources are designed to lower the barrier for anyone looking to take advantage of the vast possibilities offered by synthetic data to build smarter, more versatile AI models.
Ready to get started? Create your FREE Falcon EDU account to try these exercises for yourself. And then start your own synthetic data projects for any application you can think of!