Digit Recognition Optimization Series 00: Data Collection & Testing
Introduction
In the previous series, I completed the digit recognition project using LVGL.
However, the initial accuracy of digit recognition was not satisfactory.
Therefore, in this new series, I will focus on optimizing the model based on the current hardware setup.
The first step is to generate a test dataset directly on the target hardware to accurately evaluate performance.
The original dataset provided by Espressif was generated using a touchpad as input.
In contrast, I am now using the LVGL canvas (128 x 128) as input, and then applying linear interpolation to compress the data to 30 x 25 pixels.
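To make the downscaling step concrete, here is a minimal Python sketch of bilinear (linear in x and y) interpolation from a 128 x 128 grid to 750 values. It is an illustration, not the code that runs on the ESP32: the function name resize_bilinear is hypothetical, and it assumes "30 x 25" means 30 pixels wide by 25 pixels tall and that pixels are grayscale values in 0..255.

```python
def resize_bilinear(src, out_w, out_h):
    """Downscale a 2D list of grayscale values with bilinear interpolation."""
    in_h, in_w = len(src), len(src[0])
    out = []
    for oy in range(out_h):
        # Map the centre of the output pixel back into source coordinates.
        fy = (oy + 0.5) * in_h / out_h - 0.5
        fy = min(max(fy, 0.0), in_h - 1.0)
        y0 = int(fy)
        y1 = min(y0 + 1, in_h - 1)
        wy = fy - y0
        row = []
        for ox in range(out_w):
            fx = (ox + 0.5) * in_w / out_w - 0.5
            fx = min(max(fx, 0.0), in_w - 1.0)
            x0 = int(fx)
            x1 = min(x0 + 1, in_w - 1)
            wx = fx - x0
            # Blend the four neighbouring source pixels.
            top = src[y0][x0] * (1 - wx) + src[y0][x1] * wx
            bottom = src[y1][x0] * (1 - wx) + src[y1][x1] * wx
            row.append(round(top * (1 - wy) + bottom * wy))
        out.append(row)
    return out


# Example: a vertical bar drawn on a 128 x 128 canvas (assumed layout).
canvas = [[255 if 40 <= x < 88 else 0 for x in range(128)] for _ in range(128)]
small = resize_bilinear(canvas, out_w=30, out_h=25)
flat = [p for row in small for p in row]  # 750 values, row by row
```

Because each output pixel only blends a 2 x 2 neighbourhood, a shrink factor of more than 4 can skip thin strokes entirely, which is one reason to evaluate on data captured from the real canvas.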
How to Efficiently Create the Dataset
There are two approaches I considered:
- Save the pixel data to an SD card.
- Send the data via UART to a PC.
After evaluating both options, I chose option 2 (UART to PC), as it is more flexible and requires less code on the ESP32 side.
A Python script can then be used to handle and process the dataset.
Implementation
ESP32 Side
- Add a function to send the data via UART using a simple protocol:
  - Start with the string: "START,"
  - End with the string: ",END" followed by a newline ("\r\n")
  - Data is encoded as integer ASCII values, separated by commas.
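The framing described above can be sketched in Python as a pair of helpers. The names encode_frame and decode_frame are hypothetical and only illustrate the wire format; the real sender is the C code on the ESP32.

```python
def encode_frame(pixels):
    """Build one frame: START,<v0>,<v1>,...,END followed by CRLF."""
    body = ",".join(str(int(v)) for v in pixels)
    return f"START,{body},END\r\n".encode("ascii")


def decode_frame(line):
    """Return the list of integer values, or None if the frame is malformed."""
    parts = line.decode("ascii", errors="replace").strip().split(",")
    if len(parts) < 3 or parts[0] != "START" or parts[-1] != "END":
        return None
    try:
        # Drop the START and END markers, keep only the pixel values.
        return [int(p) for p in parts[1:-1]]
    except ValueError:
        return None


frame = encode_frame([0, 255, 17])
values = decode_frame(frame)
```

Returning None for anything that does not start with START and end with END lets the receiver discard truncated or merged frames instead of saving them.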
By using the API below, you can send raw data output without additional parasitic log messages:
void esp_log_write(esp_log_level_t level, const char *tag, const char *format, ...);
The core function in C for testing model accuracy is straightforward.
digit_test_data is a 2D array: [number of samples] x [750 pixels]. digit_test_label is a 1D array representing the digit value for each row in digit_test_data.
Code to Send Pixel Data to the PC
if (xQueueReceive(xImageQueue, &image_data, portMAX_DELAY) == pdTRUE)
Note: ESP32 automatically converts "\n" to "\r\n" in log output.
Code to Test the Samples
for (int i = 0; i < DIGIT_TEST_NUM; i++)
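The idea behind the test loop, sketched in Python for clarity: run every sample through the model, compare the prediction with its label, and report the fraction that match. This is not the author's C code; accuracy and the predict callable are hypothetical stand-ins for the on-device inference call.

```python
def accuracy(samples, labels, predict):
    """Fraction of samples whose predicted digit matches the label."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    if not samples:
        return 0.0
    correct = sum(1 for s, l in zip(samples, labels) if predict(s) == l)
    return correct / len(samples)


# Toy stand-in for the model: "predicts" the value of the first pixel.
def fake_predict(sample):
    return sample[0]


toy_data = [[1] + [0] * 749, [2] + [0] * 749, [3] + [0] * 749]
toy_labels = [1, 2, 4]
result = accuracy(toy_data, toy_labels, fake_predict)
```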
Python Side
Two scripts were developed:
- data_collect_ui.py: A GUI for collecting pixel data.
- c_code_generate.py: Converts the pixel data to a C array.
data_collect_ui.py
- Saves all bytes into a buffer.
- When a "\n" is received, it decodes the data.
- Saves the data to a file in PNG format.
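One way to implement the buffer-then-split step is sketched below. It is an assumption about how such a buffer could work, not the code in data_collect_ui.py: the class name FrameBuffer is hypothetical, and in the real script the chunks would come from the serial port rather than from literals.

```python
class FrameBuffer:
    """Accumulates raw serial bytes and yields complete lines."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk):
        """Append a chunk and return every complete line it finishes."""
        self._buf.extend(chunk)
        lines = []
        while True:
            idx = self._buf.find(b"\n")
            if idx < 0:
                break  # no complete line yet; keep the partial data
            line = bytes(self._buf[:idx]).rstrip(b"\r")
            del self._buf[:idx + 1]  # consume the line and its newline
            if line:
                lines.append(line)
        return lines


fb = FrameBuffer()
first = fb.feed(b"START,1,2")                            # partial frame
second = fb.feed(b",END\r\nSTART,3,4,END\r\nSTA")        # two frames + partial
third = fb.feed(b"RT,5,END\r\n")                         # completes the last one
```

Consuming only the bytes up to each newline means a single read that contains two frames produces two separate lines, and a frame split across reads is held until it is complete.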
c_code_generate.py
- Converts the PNG files to C arrays.
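The generation step can be sketched as follows, starting from pixel lists that have already been loaded (reading the PNG files is left out to keep the example dependency-free). The function name to_c_arrays is hypothetical, and the uint8_t element type is an assumption; the array names and DIGIT_TEST_NUM follow the names used in the C code above.

```python
def to_c_arrays(samples, labels, pixels_per_sample=750):
    """Render samples and labels as C source text."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    for s in samples:
        if len(s) != pixels_per_sample:
            raise ValueError("sample has the wrong number of pixels")
    lines = [
        "#include <stdint.h>",
        "",
        f"#define DIGIT_TEST_NUM {len(samples)}",
        "",
        # uint8_t is an assumed element type, not confirmed by the project.
        f"const uint8_t digit_test_data[DIGIT_TEST_NUM][{pixels_per_sample}] = {{",
    ]
    for s in samples:
        lines.append("    {" + ", ".join(str(v) for v in s) + "},")
    lines.append("};")
    lines.append("")
    label_text = ", ".join(str(l) for l in labels)
    lines.append(f"const uint8_t digit_test_label[DIGIT_TEST_NUM] = {{{label_text}}};")
    return "\n".join(lines) + "\n"


# Tiny example with 3 pixels per sample to keep the output readable.
c_source = to_c_arrays([[0, 1, 2], [3, 4, 5]], [7, 8], pixels_per_sample=3)
```

Rejecting samples with the wrong length at this stage keeps a malformed capture from silently shifting every following row in the generated array.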
Full code is available at:
https://github.com/tommokmok/esp32s3_lvgl_digit_recongnition/tree/test/datatset_create
Lessons Learned
- When using esp_log_write to send data, note that ESP-IDF will automatically convert "\n" to "\r\n". This may be configurable and is worth investigating further.
- Key takeaways from the Python code:
  - The output of self.ser.read(1024) is in bytes, so special characters like "\n" appear as the byte 0x0A.
  - ESP32 sends out "\r\n" instead of just "\n", even if only "\n" is specified in the code.
  - raw_data = parts[1:-1] drops the first and last elements, so neither "START" nor "END\r\n" is included in the data.
  - self.buffer must be cleared before the next parse; otherwise, data will accumulate and cause errors.
def run(self):
Note: There is a bug in the above code: sometimes two data arrays are received in a single read. Before saving to a PNG file, always check that the array has the expected size (750 values).
def save_as_png_if_valid(self, str_list):
Test Results
- Accuracy: ~88%
Next Steps
- Retrain the model using the new dataset collected from the current hardware.
