Digit Recognition Optimization Series 01: Re-training the Model

Introduction

In the previous post, I developed data collection tools to create a custom dataset using the current hardware setup.

By using the Python UI test tool, I was able to collect my own dataset directly from the ESP32 device.

I spent about an hour creating 100 samples for each digit (0-9).

Unfortunately, after merging these new samples with the original dataset and re-training the model, the accuracy did not improve.

There could be several reasons for this:

  • The 100 self-created samples per digit may simply be too few to shift the model's accuracy.
  • There might be issues with the re-training process itself.

At this point, I am not going to dive deeper to troubleshoot the exact cause.
Instead, this blog will document what I have done and record the lessons learned for future reference.

Work Done

  1. Used the following tool to capture handwritten image data from the ESP32:
    esp32s3_lvgl_digit_recongnition/tree/test/datatset_create

  2. Merged the new images into the original dataset provided by Espressif.

  3. Re-trained the model using Google Colab.
    The steps followed are the same as described in Learning ESP-DL Series 02: Walkthrough on Training, Testing, and Deployment, but with the updated dataset.

  4. Flashed the updated model to the ESP32 and tested its performance.

  5. Result:

    • Accuracy: ~88% (no improvement compared to before).
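The merge in step 2 can be sketched as follows. This is a minimal illustration, assuming both datasets are stored as MNIST-style folders of PNG files, one folder per digit; the directory names are hypothetical, so adjust them to your own layout:

```python
import shutil
from pathlib import Path

# Hypothetical locations -- adjust to match your own layout.
ORIGINAL = Path("dataset/original")   # dataset provided by Espressif
CAPTURED = Path("dataset/captured")   # samples captured from the ESP32
MERGED = Path("dataset/merged")       # combined training set

for digit in range(10):
    dst = MERGED / str(digit)
    dst.mkdir(parents=True, exist_ok=True)
    for src_root in (ORIGINAL, CAPTURED):
        # glob() yields nothing if a source folder is missing
        for img in (src_root / str(digit)).glob("*.png"):
            # Prefix with the source name so file names never collide.
            shutil.copy(img, dst / f"{src_root.name}_{img.name}")
```

After this, point the training script at `dataset/merged` instead of the original dataset directory.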

Lessons Learned

  1. Saving Model State for Re-training:
    If you want to continue training an existing model on a new dataset, you need to save both the model's state_dict and the optimizer's state_dict (the optimizer state holds buffers such as momentum that are updated as training progresses).
    According to the PyTorch documentation, resuming training without the optimizer state may not work as expected.

    save-more-data

    Therefore, if you plan to re-train the model with new data, always save both the model and optimizer states.
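    A minimal sketch of a resumable checkpoint, following the pattern from the PyTorch "Saving and Loading Models" tutorial. The network here is a stand-in, not the actual digit-recognition model:

```python
import torch
import torch.nn as nn

# Stand-in model -- substitute the real digit-recognition network.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Save BOTH state_dicts so training can be resumed later.
torch.save({
    "epoch": 10,
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
}, "checkpoint.pth")

# Later: restore both states and continue training on the new dataset.
checkpoint = torch.load("checkpoint.pth")
model.load_state_dict(checkpoint["model_state_dict"])
optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
model.train()  # back to training mode before further updates
```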

  2. Understanding PyTorch Model Files:
    Pay attention to what is actually stored in your PyTorch model files.
    The .pth or .pt file format is flexible and defined by the model owner—it may contain only the model parameters, or it may also include the optimizer state and other metadata.

    If you are unsure about the contents of a model file, try loading it with torch.load and inspecting the result.
    If the file does not match your expectations, you may encounter errors like the one below:

    load-model-error
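    Before assuming a file is a bare state_dict, it helps to inspect the loaded object's top-level structure. The helper below is a sketch (the function name and demo file are mine, not part of any project); the demo saves a small checkpoint and then inspects it:

```python
import torch
import torch.nn as nn

def inspect_checkpoint(path):
    """Print the top-level structure of a .pth/.pt file."""
    obj = torch.load(path, map_location="cpu")
    if isinstance(obj, dict):
        # Could be a bare state_dict (keys map to tensors) or a
        # checkpoint dict wrapping several state_dicts and metadata.
        for key, value in obj.items():
            print(f"{key}: {type(value).__name__}")
    else:
        # Some files store a whole pickled nn.Module instead.
        print(type(obj).__name__)
    return obj

# Demo: write a small checkpoint, then inspect it.
model = nn.Linear(4, 2)
torch.save({"model_state_dict": model.state_dict(), "epoch": 3}, "demo.pth")
inspect_checkpoint("demo.pth")
```

    If the output shows a wrapper key such as "model_state_dict", pass that entry (not the whole dict) to load_state_dict.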


By documenting these steps and lessons, I hope to make future optimization and troubleshooting more efficient.