Digit Recognition Optimization Series 00: Data Collection & Testing
Introduction
In the previous series, I completed the digit recognition project using LVGL.
However, the initial accuracy of digit recognition was not satisfactory.
Therefore, in this new series, I will focus on optimizing the model based on the current hardware setup.
The first step is to generate a test dataset directly on the target hardware to accurately evaluate performance.
The original dataset provided by Espressif was generated using a touchpad as input.
In contrast, I am now using the LVGL canvas (128 x 128) as input, and then applying linear interpolation to compress the data to 30 x 25 pixels.
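To make the downscaling step concrete, here is a minimal Python sketch of bilinear (linear in both axes) interpolation from a 128 x 128 canvas down to 30 x 25. It is an illustration only: the function name `resize_bilinear` is hypothetical, the real conversion runs in C on the ESP32 and may differ, and I am assuming that "30 x 25" means 30 columns by 25 rows.

```python
def resize_bilinear(src, out_w, out_h):
    """Resize a 2D list of pixel rows using bilinear interpolation."""
    in_h, in_w = len(src), len(src[0])
    out = []
    for y in range(out_h):
        # Map the centre of the output pixel back into source coordinates.
        fy = (y + 0.5) * in_h / out_h - 0.5
        fy = min(max(fy, 0.0), in_h - 1)
        y0 = int(fy)
        y1 = min(y0 + 1, in_h - 1)
        wy = fy - y0
        row = []
        for x in range(out_w):
            fx = (x + 0.5) * in_w / out_w - 0.5
            fx = min(max(fx, 0.0), in_w - 1)
            x0 = int(fx)
            x1 = min(x0 + 1, in_w - 1)
            wx = fx - x0
            # Blend horizontally on the two nearest rows, then vertically.
            top = src[y0][x0] * (1 - wx) + src[y0][x1] * wx
            bottom = src[y1][x0] * (1 - wx) + src[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


# Example: a blank 128 x 128 canvas with one vertical stroke drawn on it.
canvas = [[0] * 128 for _ in range(128)]
for r in range(128):
    for c in range(60, 68):
        canvas[r][c] = 255

small = resize_bilinear(canvas, 30, 25)           # 25 rows of 30 values
flat = [round(v) for row in small for v in row]   # 750 values per sample
```

Because the scale factor is larger than 4, each output pixel only looks at its 2 x 2 nearest source pixels, so thin strokes can partly fall between samples. That is one possible reason why data drawn on the canvas looks different from the touchpad dataset.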
How to Efficiently Create the Dataset
There are two approaches I considered:
- Save the pixel data to an SD card.
- Send the data via UART to a PC.
After evaluating both options, I chose option 2 (UART to PC), as it is more flexible and requires less code on the ESP32 side.
A Python script can then be used to handle and process the dataset.
Implementation
ESP32 Side
- Add a function to send the data via UART using a simple protocol:
  - Start with the string: START,
  - End with the string: ,END followed by a newline (\r\n)
  - Data is encoded as integer ASCII values, separated by commas.
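The framing described above can be sketched in a few lines of Python. The helper names `encode_frame` and `decode_frame` are hypothetical and are not taken from the project; `encode_frame` only mimics what the ESP32 would put on the wire so that the parser can be tried without hardware.

```python
def encode_frame(pixels):
    """Build one frame: START, then comma-separated integers, then ,END and CRLF."""
    body = ",".join(str(p) for p in pixels)
    return ("START," + body + ",END\r\n").encode("ascii")


def decode_frame(line):
    """Return the integers carried by one frame, or None if it is malformed."""
    parts = line.decode("ascii", errors="replace").strip().split(",")
    # The first and last fields are the START/END markers.
    if parts[0] != "START" or parts[-1] != "END":
        return None
    try:
        return [int(p) for p in parts[1:-1]]
    except ValueError:
        return None


frame = encode_frame([0, 255, 7])
print(frame)                 # the bytes as they would appear on the UART
print(decode_frame(frame))   # the recovered pixel values
```

Calling `strip()` before splitting removes the trailing `\r\n`, so the last field compares equal to `END` regardless of which line ending the device sends.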
By using the API below, you can send raw output without the extra prefix (log level, timestamp, tag) that the ESP_LOGx macros add to each message:
    void esp_log_write(esp_log_level_t level, const char *tag, const char *format, ...);
The core function in C for testing model accuracy is straightforward. digit_test_data is a 2D array: [number of samples] x [750 pixels]. digit_test_label is a 1D array representing the digit value for each row in digit_test_data.
Code to Send Pixel Data to the PC
    if (xQueueReceive(xImageQueue, &image_data, portMAX_DELAY) == pdTRUE)
Note: ESP32 automatically converts "\n" to "\r\n" in log output.
Code to Test the Samples
    for (int i = 0; i < DIGIT_TEST_NUM; i++)
Python Side
Two scripts were developed:
- data_collect_ui.py: A GUI for collecting pixel data.
- c_code_generate.py: Converts the pixel data to a C array.
data_collect_ui.py
- Saves all bytes into a buffer.
- When a \n is received, it decodes the data.
- Saves the data to a file in PNG format.
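As a rough illustration of the buffering step, the sketch below accumulates incoming chunks and hands back one complete line at a time. The class name `LineAssembler` is hypothetical and this is not the code from data_collect_ui.py; in the real script the chunks would come from something like `self.ser.read(1024)`.

```python
class LineAssembler:
    """Collect serial chunks and split them into complete lines."""

    def __init__(self):
        self.buffer = bytearray()

    def feed(self, chunk):
        """Add received bytes and return every complete line found so far."""
        self.buffer.extend(chunk)
        lines = []
        while True:
            idx = self.buffer.find(b"\n")
            if idx < 0:
                break  # no complete line yet; keep the partial data buffered
            line = bytes(self.buffer[:idx]).rstrip(b"\r")
            del self.buffer[:idx + 1]  # drop only the bytes that were consumed
            if line:
                lines.append(line)
        return lines


asm = LineAssembler()
first = asm.feed(b"START,1,2")                        # frame not finished yet
second = asm.feed(b",END\r\nSTART,3,4,END\r\nSTA")    # two frames in one read
```

Removing only the consumed bytes, instead of clearing the whole buffer, means a read that contains two frames (or one and a half) is split correctly and nothing is lost.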
c_code_generate.py
- Converts the PNG files to C arrays.
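For illustration, the generation step might look roughly like the sketch below. It starts from pixel lists that are already in memory, because reading the PNG files needs an image library and is left out here. The function name `to_c_arrays` is hypothetical, and the `uint8_t` element type is my assumption; the actual model input type may be different.

```python
def to_c_arrays(samples, labels, pixels_per_sample=750):
    """Render samples and labels as C source text (element type is assumed)."""
    if len(samples) != len(labels):
        raise ValueError("each sample needs exactly one label")
    for s in samples:
        if len(s) != pixels_per_sample:
            raise ValueError("sample has the wrong number of pixels")

    lines = ["#include <stdint.h>", "",
             f"#define DIGIT_TEST_NUM {len(samples)}", ""]
    lines.append(
        f"const uint8_t digit_test_data[DIGIT_TEST_NUM][{pixels_per_sample}] = {{")
    for s in samples:
        # One sample per row of the 2D array.
        lines.append("    {" + ", ".join(str(int(p)) for p in s) + "},")
    lines.append("};")
    lines.append("")
    lines.append("const uint8_t digit_test_label[DIGIT_TEST_NUM] = {"
                 + ", ".join(str(int(v)) for v in labels) + "};")
    return "\n".join(lines) + "\n"


# Tiny example with 2 pixels per sample so the output stays readable.
text = to_c_arrays([[1, 2], [3, 4]], [7, 9], pixels_per_sample=2)
print(text)
```

Checking the sample length before emitting anything keeps a malformed capture from silently shifting every following row in digit_test_data.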
Full code is available at:
https://github.com/tommokmok/esp32s3_lvgl_digit_recongnition/tree/test/datatset_create
Lessons Learned
- When using esp_log_write to send data, note that ESP-IDF will automatically convert "\n" to "\r\n". This may be configurable and is worth investigating further.
- Key takeaways from the Python code:
  - The output of self.ser.read(1024) is in bytes, so special characters like "\n" are represented as hex code 0x0A.
  - ESP32 sends out "\r\n" instead of just "\n", even if only "\n" is specified in the code.
  - raw_data = parts[1:-1] excludes the last element, so "END\r\n" is not included in the data.
  - The self.buffer must be cleared before the next parse; otherwise, data will accumulate and cause errors.
    def run(self):
Note: There is a bug in the above code: sometimes two data arrays are received at once. Before saving to a PNG file, always check the array size.
    def save_as_png_if_valid(self, str_list):
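A size check of this kind could look like the sketch below. It is not the body of save_as_png_if_valid; the function name `to_rows_if_valid` is hypothetical, and I am assuming a row-major layout of 25 rows by 30 columns with values in the range 0 to 255.

```python
def to_rows_if_valid(str_list, width=30, height=25):
    """Return the sample as `height` rows of `width` ints, or None if invalid."""
    # Two frames glued together (or a truncated one) fail this length check.
    if len(str_list) != width * height:
        return None
    try:
        values = [int(s) for s in str_list]
    except ValueError:
        return None
    if any(v < 0 or v > 255 for v in values):
        return None
    # Assumed layout: row-major, one row of `width` pixels after another.
    return [values[r * width:(r + 1) * width] for r in range(height)]


good = [str(i % 256) for i in range(750)]
rows = to_rows_if_valid(good)
doubled = to_rows_if_valid(good + good)   # e.g. two arrays received at once
```

The returned rows can then be handed to whichever image library writes the PNG file; a `None` result means the sample should be skipped.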
Test Results
- Accuracy: ~88%
Next Steps
- Retrain the model using the new dataset collected from the current hardware.