Study Notes (04): Intel® OpenVINO™ Toolkit Intermediate Course -- (Chapter 4) Inference Engine Optimization & Internal API

2023-03-21

Course link: https://edu.csdn.net/course/play/28807/427188?utm_source=blogtoedu

Inference Engine Optimization & Internal API

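A simple, unified inference API for Intel Architecture (IA)

Inference optimized for multiple IA hardware targets (CPU/iGPU/VPU/FPGA/others)

C++ and Python support

Runs on Ubuntu, CentOS, MacOS, Raspbian, and Win10

An open-source version is available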

For details on using the API, see:

https://docs.openvinotoolkit.org/latest/index.html

Main API functions:

Read the IR files into a net object:

net = ie.read_network(model=model_xml, weights=model_bin)

Load the network into the CPU plugin:

exec_net = ie.load_network(network=net, device_name='CPU')

Get the inference output:

out = exec_net.infer(inputs={input_blob: image}) 
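The three calls above chain into a single read, load, infer sequence. The following is a minimal sketch of that sequence, not part of the course material: run_inference is a hypothetical helper name, the IECore object is passed in as a parameter, and net.inputs is used the same way as in the lab program in these notes.

```python
def run_inference(ie, model_xml, model_bin, image, device_name="CPU"):
    """Read an IR model, load it on a device, and run one synchronous inference.

    `ie` is expected to behave like an IECore object; this helper is a
    hypothetical wrapper for illustration, not an OpenVINO API.
    """
    # Read the IR (.xml topology + .bin weights) into a network object.
    net = ie.read_network(model=model_xml, weights=model_bin)
    # Name of the first input of the model.
    input_blob = next(iter(net.inputs))
    # Compile/load the network for the chosen device plugin.
    exec_net = ie.load_network(network=net, device_name=device_name)
    # Returns a dict that maps output names to results.
    return exec_net.infer(inputs={input_blob: image})
```

With the real toolkit installed, a call would look like out = run_inference(IECore(), 'model.xml', 'model.bin', image), where image is already resized and transposed to the layout the model expects.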

Lab exercises:

1) List the available devices:

Use ie.available_devices together with the print() function:

print(ie.available_devices)

['CPU', 'GNA']

2) Run inference on a different device. For example, if you have HDDL...

        exec_net = ie.load_network(network=net, device_name='HDDL')
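Exercises 1 and 2 can be combined: take the list reported by ie.available_devices and choose a device from it instead of hard-coding the name. The sketch below assumes a simple preference order; pick_device and the order HDDL, GPU, CPU are illustrative assumptions, not part of the course code or the OpenVINO API.

```python
def pick_device(available, preferred=("HDDL", "GPU", "CPU")):
    """Return the first name in `preferred` that also appears in `available`.

    Hypothetical helper: the preference order is an assumption for illustration.
    """
    for name in preferred:
        if name in available:
            return name
    raise RuntimeError(
        "none of {} found in {}".format(list(preferred), list(available))
    )


# Device list as printed in exercise 1 of these notes.
available = ['CPU', 'GNA']
device = pick_device(available)
print(device)
```

In the lab program the result would be passed on as device_name, for example ie.load_network(network=net, device_name=pick_device(ie.available_devices)).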

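3) Add performance counters:

From the available devices found in the previous exercise, choose the one to run this exercise on and set it in the "device_name" option. Add the performance counters with performance_counters = exec_net.requests[0].get_perf_counts().

Program code: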

#!/usr/bin/env python
"""
 Copyright (C) 2018-2020 Intel Corporation

 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
"""
from __future__ import print_function
import sys
import cv2
import numpy as np
#from openvino.inference_engine import IECore
from openvino.inference_engine import IENetwork, IECore


def main():
    model_xml = 'model.xml'  #The DL model, IR format
    model_bin = 'model.bin'

    ie = IECore()  #Inference-Engine Core object
    net = ie.read_network(model=model_xml, weights=model_bin)  #Read IR

    input_blob = next(iter(net.inputs))  #first layer of the model
    out_blob = next(iter(net.outputs))  #last layer
    net.batch_size = 1

    #based on last lab, use available device to exercise
    ##-->add your device name here, you can choose target device to use when you have many device
    exec_net = ie.load_network(network=net, device_name='CPU')

    n, c, h, w = net.inputs[input_blob].shape  #Input dimensions
    image = np.ndarray(shape=(n, c, h, w))
    image = cv2.imread('image1.jpg')  #read input image
    if image.shape[:-1] != (h, w):  #resize image to match input sizes/shape
        image = cv2.resize(image, (w, h))
    image = image.transpose((2, 0, 1))  # Change data layout from HWC to CHW

    out = exec_net.infer(inputs={input_blob: image})  # Inference
    out = out[out_blob]

    with open('labels.txt', 'r') as f:  #Read labels file
        labels_map = [x.split(sep=' ', maxsplit=1)[-1].strip() for x in f]

    for i, probs in enumerate(out):
        probs = np.squeeze(probs)
        top_ind = np.argsort(probs)[-10:][::-1]
        print('\n Class Probability')  #print header
        print('---------------------------------------------')
        for id in top_ind:
            det_label = labels_map[id] if labels_map else "{}".format(id)
            print("{:30}{:.7f}".format(det_label, probs[id]))
        print("\n")

    #Performance counters.
    #--> Your code here..
    performance_counters = exec_net.requests[0].get_perf_counts()
    print('{:<40} {:<15} {:<25} {:<15} {:<10}'.format('name', 'layer_type', 'exet_type', 'status', 'real_time, us'))
    print("----------------------------------------------------------------------------------------------------------")
    for layer, stats in performance_counters.items():
        print('{:<40} {:<15} {:<25} {:<15} {:<10}'.format(layer, stats['layer_type'], stats['exec_type'], stats['status'], stats['real_time']))


if __name__ == '__main__':
    sys.exit(main() or 0)

Output:

python3 classifiction-2.py

 Class                       Probability
---------------------------------------------
cheeseburger                  0.9797323
beigel                        0.0092792
hot dog, red hot              0.0088993
guacamole                     0.0011216
Smith                         0.0005306
potpie                        0.0001503
cuke                          0.0001258
cream, icecream               0.0000298
ball                          0.0000223
bakeshop, bakehouse           0.0000212

name                                     layer_type      exet_type                 status          real_time, us
----------------------------------------------------------------------------------------------------------
conv1                                    Convolution     jit_avx512_FP32           EXECUTED        3184
conv10                                   Convolution     jit_avx512_1x1_FP32       EXECUTED        1474
fire2/concat                             Concat          unknown_FP32              EXECUTED        21
fire2/expand1x1                          Convolution     jit_avx512_1x1_FP32       EXECUTED        54
fire2/expand3x3                          Convolution     jit_avx512_FP32           EXECUTED        403
fire2/relu_expand1x1                     ReLU            undef                     NOT_RUN         0
fire2/relu_expand3x3                     ReLU            undef                     NOT_RUN         0
fire2/relu_squeeze1x1                    ReLU            undef                     NOT_RUN         0
fire2/squeeze1x1                         Convolution     jit_avx512_1x1_FP32       EXECUTED        184
fire3/concat                             Concat          unknown_FP32              EXECUTED        33
fire3/expand1x1                          Convolution     jit_avx512_1x1_FP32       EXECUTED        76
fire3/expand3x3                          Convolution     jit_avx512_FP32           EXECUTED        387
fire3/relu_expand1x1                     ReLU            undef                     NOT_RUN         0
fire3/relu_expand3x3                     ReLU            undef                     NOT_RUN         0
fire3/relu_squeeze1x1                    ReLU            undef                     NOT_RUN         0
fire3/squeeze1x1                         Convolution     jit_avx512_1x1_FP32       EXECUTED        118
fire4/concat                             Concat          unknown_FP32              EXECUTED        25
fire4/expand1x1                          Convolution     jit_avx512_1x1_FP32       EXECUTED        44
fire4/expand3x3                          Convolution     jit_avx512_FP32           EXECUTED        390
fire4/relu_expand1x1                     ReLU            undef                     NOT_RUN         0
fire4/relu_expand3x3                     ReLU            undef                     NOT_RUN         0
fire4/relu_squeeze1x1                    ReLU            undef                     NOT_RUN         0
fire4/squeeze1x1                         Convolution     jit_avx512_1x1_FP32       EXECUTED        76
fire5/concat                             Concat          unknown_FP32              EXECUTED        18
fire5/expand1x1                          Convolution     jit_avx512_1x1_FP32       EXECUTED        44
fire5/expand3x3                          Convolution     jit_avx512_FP32           EXECUTED        394
fire5/relu_expand1x1                     ReLU            undef                     NOT_RUN         0
fire5/relu_expand3x3                     ReLU            undef                     NOT_RUN         0
fire5/relu_squeeze1x1                    ReLU            undef                     NOT_RUN         0
fire5/squeeze1x1                         Convolution     jit_avx512_1x1_FP32       EXECUTED        119
fire6/concat                             Concat          unknown_FP32              EXECUTED        10
fire6/expand1x1                          Convolution     jit_avx512_1x1_FP32       EXECUTED        31
fire6/expand3x3                          Convolution     jit_avx512_FP32           EXECUTED        222
fire6/relu_expand1x1                     ReLU            undef                     NOT_RUN         0
fire6/relu_expand3x3                     ReLU            undef                     NOT_RUN         0
fire6/relu_squeeze1x1                    ReLU            undef                     NOT_RUN         0
fire6/squeeze1x1                         Convolution     jit_avx512_1x1_FP32       EXECUTED        57
fire7/concat                             Concat          unknown_FP32              EXECUTED        2
fire7/expand1x1                          Convolution     jit_avx512_1x1_FP32       EXECUTED        30
fire7/expand3x3                          Convolution     jit_avx512_FP32           EXECUTED        321
fire7/relu_expand1x1                     ReLU            undef                     NOT_RUN         0
fire7/relu_expand3x3                     ReLU            undef                     NOT_RUN         0
fire7/relu_squeeze1x1                    ReLU            undef                     NOT_RUN         0
fire7/squeeze1x1                         Convolution     jit_avx512_1x1_FP32       EXECUTED        55
fire8/concat                             Concat          unknown_FP32              EXECUTED        2
fire8/expand1x1                          Convolution     jit_avx512_1x1_FP32       EXECUTED        46
fire8/expand3x3                          Convolution     jit_avx512_FP32           EXECUTED        376
fire8/relu_expand1x1                     ReLU            undef                     NOT_RUN         0
fire8/relu_expand3x3                     ReLU            undef                     NOT_RUN         0
fire8/relu_squeeze1x1                    ReLU            undef                     NOT_RUN         0
fire8/squeeze1x1                         Convolution     jit_avx512_1x1_FP32       EXECUTED        108
fire9/concat                             Concat          unknown_FP32              EXECUTED        2
fire9/expand1x1                          Convolution     jit_avx512_1x1_FP32       EXECUTED        54
fire9/expand3x3                          Convolution     jit_avx512_FP32           EXECUTED        445
fire9/relu_expand1x1                     ReLU            undef                     NOT_RUN         0
fire9/relu_expand3x3                     ReLU            undef                     NOT_RUN         0
fire9/relu_squeeze1x1                    ReLU            undef                     NOT_RUN         0
fire9/squeeze1x1                         Convolution     jit_avx512_1x1_FP32       EXECUTED        106
out_prob                                 Output          unknown_FP32              NOT_RUN         0
pool1                                    Pooling         jit_avx512_FP32           EXECUTED        936
pool10/reduce                            Pooling         jit_avx512_FP32           EXECUTED        35
pool10/reduce_nChw16c_nchw_prob          Reorder         ref_any_FP32              EXECUTED        72
pool3                                    Pooling         jit_avx512_FP32           EXECUTED        91
pool5                                    Pooling         jit_avx512_FP32           EXECUTED        58
prob                                     SoftMax         jit_avx512_FP32           EXECUTED        21
relu_conv1                               ReLU            undef                     NOT_RUN         0
relu_conv10                              ReLU            undef                     NOT_RUN         0
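The per-layer counters above are easier to read once they are reduced to a total and a short list of the slowest layers. The sketch below shows one way to do that. It assumes the counters form a dict keyed by layer name with 'status' and 'real_time' entries, which is how the lab program reads them; summarize_perf_counts is a hypothetical helper name, and the sample holds only five rows copied from the output above.

```python
def summarize_perf_counts(perf_counts, top_n=3):
    """Return (total_us, slowest) over layers whose status is EXECUTED.

    Hypothetical helper; `perf_counts` is assumed to map a layer name to a
    dict with at least 'status' and 'real_time' (microseconds).
    """
    executed = [
        (name, stats["real_time"])
        for name, stats in perf_counts.items()
        if stats["status"] == "EXECUTED"
    ]
    total_us = sum(us for _, us in executed)
    slowest = sorted(executed, key=lambda item: item[1], reverse=True)[:top_n]
    return total_us, slowest


# Five rows copied from the output above (not the full table).
sample = {
    "conv1": {"layer_type": "Convolution", "exec_type": "jit_avx512_FP32",
              "status": "EXECUTED", "real_time": 3184},
    "conv10": {"layer_type": "Convolution", "exec_type": "jit_avx512_1x1_FP32",
               "status": "EXECUTED", "real_time": 1474},
    "pool1": {"layer_type": "Pooling", "exec_type": "jit_avx512_FP32",
              "status": "EXECUTED", "real_time": 936},
    "prob": {"layer_type": "SoftMax", "exec_type": "jit_avx512_FP32",
             "status": "EXECUTED", "real_time": 21},
    "relu_conv1": {"layer_type": "ReLU", "exec_type": "undef",
                   "status": "NOT_RUN", "real_time": 0},
}

total_us, slowest = summarize_perf_counts(sample)
print("total executed time of sample rows (us):", total_us)
for name, us in slowest:
    print("{:<20} {:>8}".format(name, us))
```

In the lab program the same function could be called on performance_counters directly. Layers reported as NOT_RUN are skipped, so they do not affect the total.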

 
