Install OpenVINO first (the mo command ships in the openvino-dev package, so install that as well if mo is not found on your PATH):

    pip install openvino

Assume you have an ONNX export of your PyTorch model:

    mo --input_model my_model.onnx --output_dir ./optimized_model

Here is a Python snippet to run your newly minted IR model:
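A minimal sketch of loading the converted IR, running one inference, and timing it could look like the following. It rests on several assumptions: `mo` wrote `my_model.xml` and `my_model.bin` into `./optimized_model`, the model has a single float32 input with a static shape, and the installed release exposes `Core` under `openvino.runtime` (newer releases also expose it as `openvino.Core`). The helper names `ir_paths`, `median_latency_ms`, and `run_ir` are illustrative and not part of OpenVINO.

```python
import statistics
import time
from pathlib import Path


def ir_paths(output_dir, model_name):
    """Return the (.xml, .bin) pair that mo is assumed to have written."""
    out = Path(output_dir)
    return out / f"{model_name}.xml", out / f"{model_name}.bin"


def median_latency_ms(fn, warmup=5, iters=50):
    """Call fn() repeatedly and return the median wall-clock latency in ms."""
    for _ in range(warmup):  # warm-up calls are not timed
        fn()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def run_ir(xml_path, device="CPU"):
    """Load an IR model, run it on random input, and time the inference call."""
    # Imported lazily so the helpers above work without OpenVINO installed.
    import numpy as np
    from openvino.runtime import Core

    core = Core()
    model = core.read_model(str(xml_path))  # the .bin is picked up next to the .xml
    compiled = core.compile_model(model, device)

    # Assumption: one input, static shape, float32 data.
    shape = list(compiled.input(0).shape)
    dummy = np.random.rand(*shape).astype(np.float32)

    result = compiled([dummy])[compiled.output(0)]
    latency = median_latency_ms(lambda: compiled([dummy]))
    return result.shape, latency
```

To try it, call `run_ir(ir_paths("./optimized_model", "my_model")[0])` and compare the returned median latency with the same measurement taken on the original framework model. Random input is enough for timing, but use real preprocessed data when checking that the outputs still match.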
Take your slowest production model, run it through the Model Optimizer, and benchmark the result. You will be shocked. Have you used OpenVINO or the Intel DLDT in production? Let me know your latency improvements in the comments below!