Hello,
I understand -- this process can be confusing on the first attempt, and I'll help however I can! I will respond to a few parts of your last comment. Please note that this is not an ordered list of steps.
A) General suggestion
My first suggestion is to follow one of the examples in edgeai-tidl-tools and the custom-model workflow. Take one of our existing models like yolox (pretrained), and see how examples/osrt_python/ort/onnxrt_ep.py is used to compile and run the model on the host (a rough sketch of that compile step is included at the end of this section). You can then follow the steps in the README to transfer portions of edgeai-tidl-tools (examples, test_data, models, model-artifacts) to the EVM and run the model on a static input.
- I recommend this not because it solves your problem directly, but so that you can learn the development flow on a known-working model, since your yolov8 will add more complexity.
- I'll address your model in part D of the response.
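For reference, here is a rough sketch of what onnxrt_ep.py does during the host-side compile, assuming the onnxruntime wheel installed by the edgeai-tidl-tools setup (which provides the TIDL providers). The paths, option values, and input shape below are placeholders only -- the real script reads these from its model configs, so treat this as an illustration rather than a recipe:
```
# Minimal sketch of host-side compilation with the TIDL compilation provider.
# All paths and option values are placeholders for illustration only.
import numpy as np
import onnxruntime as ort

compile_options = {
    "tidl_tools_path": "/path/to/edgeai-tidl-tools/tidl_tools",   # placeholder
    "artifacts_folder": "model-artifacts/my-model",               # placeholder output folder
    "tensor_bits": 8,                                             # 8-bit quantization
    "advanced_options:calibration_frames": 4,
    "advanced_options:calibration_iterations": 4,
}

sess = ort.InferenceSession(
    "models/my_model.onnx",                                       # placeholder model path
    providers=["TIDLCompilationProvider", "CPUExecutionProvider"],
    provider_options=[compile_options, {}],
)

# Feeding a few representative frames through the session drives calibration
# and writes the compiled artifacts into artifacts_folder.
input_name = sess.get_inputs()[0].name
dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)        # placeholder input shape
sess.run(None, {input_name: dummy})

# On the EVM, you would then create the session with "TIDLExecutionProvider"
# and point it at the same artifacts_folder.
```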
B) PTC logs
Your jupyter notebooks look like they are using an outdated repository. We migrated several months ago from multiple edgeai- repos for model training and development and wrapped those into the edgeai-tensorlab repo to make repo versioning more consistent and controlled. You can continue using the older repos if you are intentionally targeting an older SDK version like 9.1, but otherwise I suggest cloning edgeai-tensorlab and working from that repository instead.
I see your modelmaker log mentions you are targeting the 9.1 SDK. The modeloptimization tooling was fairly new at that point, and I'm less confident about that version than more recent releases (r9.2 and r10.0, corresponding to the SDK versions).
- I also see setup errors for scipy due to a BLAS/LAPACK dependency. This probably requires a system-level package that cannot be installed by pip3. I'd recommend searching online for guidance on installing the BLAS/LAPACK development packages for your distribution.
C) Modelmaker logs
It looks like modelmaker trained the model but failed during compilation. This means pytorch trained yolox and produced an ONNX file. The next stage is compilation, which uses edgeai-benchmark -- an alternative to edgeai-tidl-tools.
- edgeai-benchmark is for testing/compiling with larger datasets. It can compile the model and then test the accuracy of that model on your dataset.
  - typically used by advanced users or as part of the modelmaker workflow
- edgeai-tidl-tools is for baseline evaluation of inference speed and accuracy on a smaller set of data.
  - typically used for initial testing on new models; often this is sufficient for compiling model artifacts and testing
Your model failed during edgeai-benchmark compilation because it could not find a dependency:
ImportError: cannot import name 'onnx_model_opt' from 'osrt_model_tools.onnx_tools' (/home/zxb/.pyenv/versions/py310/lib/python3.10/site-packages/osrt_model_tools/onnx_tools/__init__.py)
This dependency is set up as part of edgeai-tidl-tools, and the source python is under the /scripts directory (osrt_model_tools/onnx_tools). The imported function is meant to add preprocessing layers to the model (YOLOX) that allow uint8 input instead of float. This is why a later error complained about the input data type.
- What is the commit tag / branch of edgeai-tidl-tools? It may have set up a more recent version. The appropriate release would be 09_01_07_00. We should be able to update with the proper tools and rerun this (a quick diagnostic check is sketched below).
- alternatively, take the trained ONNX model and import/compile it using edgeai-tidl-tools standalone. This is related to suggestion A.
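As a quick diagnostic, something like the following sketch will show which osrt_model_tools is installed in your environment and whether the onnx_model_opt module that edgeai-benchmark expects is actually present:
```
# Diagnostic sketch: check which osrt_model_tools is installed and whether the
# onnx_model_opt module that edgeai-benchmark imports is present.
import pkgutil
import osrt_model_tools.onnx_tools as onnx_tools

print("installed at:", onnx_tools.__file__)
print("onnx_tools submodules:", [m.name for m in pkgutil.iter_modules(onnx_tools.__path__)])

try:
    from osrt_model_tools.onnx_tools import onnx_model_opt  # the import that failed in your log
    print("onnx_model_opt is available")
except ImportError as err:
    print("still missing -- reinstall osrt_model_tools from the scripts/ directory "
          "of the matching edgeai-tidl-tools release:", err)
```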
Wang Xiaojun said:
I followed your advice to use edgeai-optimization. When using the PTC example, FX Graph Mode Quantization is in maintenance mode.
Could you explain more, please? I assume you mean that you followed the steps for modifying the yolov8 model based on the linked issue in the edgeai-tensorlab repo. I'm not sure what you mean by "FX Graph Mode Quantization is in maintenance mode".
D) YOLOv8 Guidance
The steps detailed in that issue-7 explain how to train a modified yolov8 that will work well with TI's accelerator (C7xMMA). This training flow uses the upstream mmyolo repo, patched with the files posted on that tensorlab issue-7, which modify the model structure to be a better fit for the C7xMMA. This does require some retraining.
Since you have a pretrained model, you can use those weights as the starting point for retraining this modified structure. It should not take more than 100 epochs to retrain this way, but depending on your yolov8 source, it could take some effort to align the pretrained weights with the most appropriate yolov8 config. The PTH for your yolov8 would need to have tensors/weights named similarly to what the mmyolo repo's yolov8 variants expect.
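To illustrate the fine-tune idea, a config in the mmyolo/mmengine style could look roughly like this. The base config name, checkpoint path, epoch count, and learning rate are all placeholders -- in practice you would inherit from the patched yolov8 config provided on issue-7 and substitute your own values:
```
# Illustrative fine-tuning config in the mmyolo/mmengine style. The base config name,
# checkpoint path, epoch count, and learning rate are placeholders only.
_base_ = './yolov8_s_patched_config.py'            # hypothetical name for the patched (TI-friendly) config

load_from = '/path/to/your_pretrained_yolov8.pth'  # your existing weights as the starting point

# Short schedule at a low learning rate, since this is a fine-tune of the modified
# structure rather than training from scratch (25-100 epochs is usually enough).
train_cfg = dict(max_epochs=50, val_interval=5)
optim_wrapper = dict(optimizer=dict(lr=0.001))
```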
Wang Xiaojun said:
Our previous projects all used YOLOv8. Should I use the trained v8 model for optimization (PTQ/QAT PTC), or use edgeai-yolox/edgeai-mmdetection to retrain a new v8-ti-lite model offline, since our company's dataset doesn't allow any uploading?
I would encourage the first option here --
- use your trained model with a patched version of mmyolo to apply our model surgery
- retrain for 25-100 epochs at a low learning rate.
- You may also run QAT at this time if you wish, but I would suggest not doing so on the first attempt. You can come back and do this later if accuracy from PTQ is not sufficient.
- Export the model to ONNX format with an accompanying prototxt file that describes the detection head.
- Use this ONNX and PROTOTXT to import/compile the model with edgeai-tidl-tools/examples/osrt_python/ort/onnxrt_ep.py (the object-detection compile options are sketched after this list).
- Follow the custom-model guidelines in that repo
- this will produce the compiled model artifacts that you can then run on the EVM
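For that compile step, the object-detection options are what tell TIDL about the detection head via the prototxt. A rough sketch of those options is below; the paths are placeholders and the meta_arch_type value should be double-checked against the TIDL documentation for your export:
```
# Sketch of the extra object-detection options passed to the TIDL compilation provider
# when a meta-architecture prototxt is available. Paths are placeholders, and the
# meta_arch_type value should be verified against the TIDL docs for your export.
od_compile_options = {
    "tidl_tools_path": "/path/to/edgeai-tidl-tools/tidl_tools",     # placeholder
    "artifacts_folder": "model-artifacts/yolov8-custom",             # placeholder
    "tensor_bits": 8,
    "advanced_options:calibration_frames": 4,
    "advanced_options:calibration_iterations": 4,
    # Detection-head description exported alongside the ONNX model:
    "object_detection:meta_layers_names_list": "/path/to/yolov8_custom.prototxt",  # placeholder
    "object_detection:meta_arch_type": 6,  # value used by YOLO-style examples; please verify for your model
}
# In practice these options are set through the model config entry consumed by
# examples/osrt_python/ort/onnxrt_ep.py rather than hard-coded like this.
```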
BR,
Reese