YOLOv5 Study Notes

2024-04-12

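Contents

1. References
2. Testing with the sample images from the source repository
    2.1 Download the source code
    2.2 Install dependencies
    2.3 Download the model
    2.4 Test results
3. Training on your own data
    3.1 References
    3.2 Train on coco128
        3.2.1 Training results
        3.2.2 View the training curves
    3.3 Safety helmet detection (Ubuntu 16.04)
        3.3.1 Data and preprocessing
        3.3.2 Modify the configuration files
        3.3.3 Trial training
        3.3.4 Test
    3.4 Training yolov5s for 100 epochs
        3.4.1 Anchor box clustering
        3.4.2 train
        3.4.3 test.py
    3.5 Training yolov5s for 600 epochs
    3.6 Testing yolov5x (Ubuntu 16.04)
        3.6.1 train.py
        3.6.2 test.py
        3.6.3 detect.py
    3.7 Retesting YOLOv5 (Windows 10)
        3.7.1 Preparation
        3.7.2 Training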

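1. References

- Source code: https://github.com/ultralytics/yoloV5 (the Ultralytics "U version" of YOLOv5, 2020-06-25)
- "How should we evaluate YOLOv5?"
- "YOLO V5 and YOLO V4 explained in one article"
- YOLOv4 (AlexeyAB) original paper: "YOLOv4: Optimal Speed and Accuracy of Object Detection", 2020-04-23
- "YOLOv5 in plain language"
- "The YOLO series in plain language: a complete walkthrough of YOLOv5 core fundamentals"
- "YOLOv3, YOLOv4 and YOLOv5 model weights and network structure diagrams: download resources"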

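2. Testing with the sample images from the source repository

2.1 Download the source code

Download the repository as a ZIP archive.

2.2 Install dependencies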

pip install -r requirements.txt

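Or install the packages one at a time:

pip install <package>

2.3 Download the model

Running detect.py directly downloads the yolov5s.pt model file automatically, but the download is very slow.

You can instead fetch it manually from https://github.com/ultralytics/yolov5/releases and place it in the same directory as detect.py.

2.4 Test results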

Namespace(agnostic_nms=False, augment=False, classes=None, conf_thres=0.25, device='', img_size=640, iou_thres=0.45, output='inference/output', save_conf=False, save_txt=False, source='inference/images', update=False, view_img=False, weights='yolov5s.pt')
Using CUDA device0 _CudaDeviceProperties(name='GeForce RTX 2080 Ti', total_memory=11011MB)

Fusing layers...
Model Summary: 140 layers, 7.45958e+06 parameters, 0 gradients
image 1/2 /home/hjz/PycharmProjects/pythonProject/yolov5-master/inference/images/bus.jpg: 640x480 4 persons, 1 buss, 1 skateboards, Done. (0.069s)
image 2/2 /home/hjz/PycharmProjects/pythonProject/yolov5-master/inference/images/zidane.jpg: 384x640 2 persons, 2 ties, Done. (0.054s)
Results saved to inference/output
Done. (0.168s)

Process finished with exit code 0
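The per-image times in this log can be turned into a rough throughput figure. The following is a minimal sketch, assuming log lines of the `image ... Done. (0.069s)` form shown above; the helper name `per_image_fps` is hypothetical and not part of the YOLOv5 repository.

```python
import re


def per_image_fps(log_text):
    """Return one frames-per-second estimate per 'image ...' line in a detect.py log."""
    fps = []
    for line in log_text.splitlines():
        # Only lines starting with 'image' carry a per-image time;
        # the final 'Done. (0.168s)' line is the total and is skipped.
        if not line.startswith("image"):
            continue
        match = re.search(r"Done\. \((\d+\.\d+)s\)", line)
        if match:
            fps.append(1.0 / float(match.group(1)))
    return fps


log = """image 1/2 bus.jpg: 640x480 4 persons, 1 buss, 1 skateboards, Done. (0.069s)
image 2/2 zidane.jpg: 384x640 2 persons, 2 ties, Done. (0.054s)
Done. (0.168s)"""

rates = per_image_fps(log)
print([round(rate, 1) for rate in rates])
```

For the two sample images this gives roughly 14.5 and 18.5 images per second. These figures cover the timed step only, so the total run time of 0.168 s is longer than the sum of the two.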

Video:

python detect.py --source=inference/int/01.mp4 --output=inference/out/01.mp4

coco_classes_names:
['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
 'hair drier', 'toothbrush']
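The Namespace printed in 2.4 includes a `classes` option (None by default). Assuming it filters detections by class index, and that an index is simply the position of a name in the list above, a lookup like the following sketch finds the numbers to pass. Only the first eight names are copied here to keep the example short, and `class_indices` is a hypothetical helper.

```python
# First entries of the COCO class list quoted above; index = position in the list.
COCO_NAMES_HEAD = ['person', 'bicycle', 'car', 'motorcycle',
                   'airplane', 'bus', 'train', 'truck']


def class_indices(wanted, names):
    """Return the list positions of the requested class names."""
    lookup = {name: index for index, name in enumerate(names)}
    missing = [name for name in wanted if name not in lookup]
    if missing:
        raise KeyError(f"unknown class names: {missing}")
    return [lookup[name] for name in wanted]


indices = class_indices(['person', 'bus'], COCO_NAMES_HEAD)
print(indices)
```

With the full 80-name list the same function covers every class. Check the exact flag syntax against the `--help` output of your copy of detect.py.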

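3. Training on your own data

3.1 References

- Official guide: https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data
- "Training your own dataset with YOLOv5"
- "Training your own dataset with the PyTorch version of YOLOv5"

3.2 Train on coco128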

python train.py --epochs=20

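The script downloads the weights and the coco128 dataset by itself. If that is too slow, download them manually: the coco128 dataset, and the YOLOv5 weights and related files (v3.0).

1. Place the coco128 dataset next to the project directory, at the same level as the yolov5 folder.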
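A quick way to confirm the layout before training is sketched below. It assumes the dataset folder is named `coco128` and contains `labels/train2017`, as the `../coco128/labels/train2017.cache` path in the training log suggests; `check_layout` is a hypothetical helper.

```python
from pathlib import Path


def check_layout(project_dir):
    """Report whether a coco128 folder sits next to the given project folder."""
    project = Path(project_dir).resolve()
    dataset = project.parent / "coco128"  # sibling of the project folder
    return {
        "project_exists": project.is_dir(),
        "dataset_is_sibling": dataset.is_dir(),
        "labels_found": (dataset / "labels" / "train2017").is_dir(),
    }


if __name__ == "__main__":
    # Run from inside the yolov5 project folder.
    print(check_layout("."))
```

If any value is False, train.py will fall back to downloading the dataset or fail to find the labels.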

3.2.1 Training results

Using CUDA device0 _CudaDeviceProperties(name='GeForce RTX 2080 Ti', total_memory=11011MB)

Namespace(adam=False, batch_size=16, bucket='', cache_images=False, cfg='', data='data/coco128.yaml', device='', epochs=20, evolve=False, global_rank=-1, hyp='data/hyp.scratch.yaml', image_weights=False, img_size=[640, 640], local_rank=-1, logdir='runs/', multi_scale=False, name='', noautoanchor=False, nosave=False, notest=False, rect=False, resume=False, single_cls=False, sync_bn=False, total_batch_size=16, weights='yolov5s.pt', workers=8, world_size=1)
Start Tensorboard with "tensorboard --logdir runs/", view at http://localhost:6006/
Hyperparameters {'lr0': 0.01, 'lrf': 0.2, 'momentum': 0.937, 'weight_decay': 0.0005, 'warmup_epochs': 3.0, 'warmup_momentum': 0.8, 'warmup_bias_lr': 0.1, 'box': 0.05, 'cls': 0.5, 'cls_pw': 1.0, 'obj': 1.0, 'obj_pw': 1.0, 'iou_t': 0.2, 'anchor_t': 4.0, 'fl_gamma': 0.0, 'hsv_h': 0.015, 'hsv_s': 0.7, 'hsv_v': 0.4, 'degrees': 0.0, 'translate': 0.1, 'scale': 0.5, 'shear': 0.0, 'perspective': 0.0, 'flipud': 0.0, 'fliplr': 0.5, 'mosaic': 1.0, 'mixup': 0.0}

     from          n  params   module                                 arguments
 0   -1            1  3520     models.common.Focus                    [3, 32, 3]
 1   -1            1  18560    models.common.Conv                     [32, 64, 3, 2]
 2   -1            1  19904    models.common.BottleneckCSP            [64, 64, 1]
 3   -1            1  73984    models.common.Conv                     [64, 128, 3, 2]
 4   -1            1  161152   models.common.BottleneckCSP            [128, 128, 3]
 5   -1            1  295424   models.common.Conv                     [128, 256, 3, 2]
 6   -1            1  641792   models.common.BottleneckCSP            [256, 256, 3]
 7   -1            1  1180672  models.common.Conv                     [256, 512, 3, 2]
 8   -1            1  656896   models.common.SPP                      [512, 512, [5, 9, 13]]
 9   -1            1  1248768  models.common.BottleneckCSP            [512, 512, 1, False]
10   -1            1  131584   models.common.Conv                     [512, 256, 1, 1]
11   -1            1  0        torch.nn.modules.upsampling.Upsample   [None, 2, 'nearest']
12   [-1, 6]       1  0        models.common.Concat                   [1]
13   -1            1  378624   models.common.BottleneckCSP            [512, 256, 1, False]
14   -1            1  33024    models.common.Conv                     [256, 128, 1, 1]
15   -1            1  0        torch.nn.modules.upsampling.Upsample   [None, 2, 'nearest']
16   [-1, 4]       1  0        models.common.Concat                   [1]
17   -1            1  95104    models.common.BottleneckCSP            [256, 128, 1, False]
18   -1            1  147712   models.common.Conv                     [128, 128, 3, 2]
19   [-1, 14]      1  0        models.common.Concat                   [1]
20   -1            1  313088   models.common.BottleneckCSP            [256, 256, 1, False]
21   -1            1  590336   models.common.Conv                     [256, 256, 3, 2]
22   [-1, 10]      1  0        models.common.Concat                   [1]
23   -1            1  1248768  models.common.BottleneckCSP            [512, 512, 1, False]
24   [17, 20, 23]  1  229245   models.yolo.Detect                     [80, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
Model Summary: 191 layers, 7.46816e+06 parameters, 7.46816e+06 gradients

Transferred 370/370 items from yolov5s.pt
Optimizer groups: 62 .bias, 70 conv.weight, 59 other
Scanning images: 100% 128/128 [00:00<00:00, 3224.53it/s]
Scanning labels ../coco128/labels/train2017.cache (126 found, 0 missing, 2 empty, 0 duplicate, for 128 images): 128it [00:00, 4429.52it/s]
Scanning labels ../coco128/labels/train2017.cache (126 found, 0 missing, 2 empty, 0 duplicate, for 128 images): 128it [00:00, 14134.51it/s]

Analyzing anchors... anchors/target = 4.26, Best Possible Recall (BPR) = 0.9946
Image sizes 640 train, 640 test
Using 8 dataloader workers
Logging results to runs/exp1
Starting training for 20 epochs...

Training columns come from the "Epoch" line of each epoch; P, R and mAP come from the class "all" validation line (128 images, 929 targets in every epoch).

Epoch  gpu_mem  box      obj      cls      total   targets  img_size  P      R      mAP@.5  mAP@.5:.95
0/19   5.24G    0.04188  0.06183  0.01566  0.1194  171      640       0.405  0.761  0.698   0.442
1/19   5.12G    0.04172  0.05666  0.01659  0.115   146      640       0.399  0.765  0.699   0.447
2/19   5.12G    0.0426   0.06244  0.01579  0.1208  196      640       0.404  0.773  0.702   0.453
3/19   5.12G    0.04476  0.06601  0.01603  0.1268  204      640       0.396  0.778  0.705   0.455
4/19   5.12G    0.04329  0.06541  0.01635  0.1251  252      640       0.39   0.781  0.706   0.458
5/19   5.12G    0.043    0.05926  0.01625  0.1185  146      640       0.39   0.785  0.713   0.463
6/19   5.12G    0.04202  0.06307  0.01541  0.1205  204      640       0.388  0.791  0.719   0.467
7/19   5.12G    0.04285  0.06677  0.0151   0.1247  204      640       0.388  0.794  0.723   0.474
8/19   5.12G    0.04252  0.05974  0.01529  0.1176  211      640       0.386  0.794  0.726   0.48
9/19   5.12G    0.04098  0.06076  0.01374  0.1155  227      640       0.395  0.799  0.73    0.477
10/19  5.12G    0.04312  0.06949  0.0154   0.128   185      640       0.393  0.798  0.74    0.483
11/19  5.12G    0.04207  0.05844  0.0155   0.116   190      640       0.4    0.802  0.744   0.489
12/19  5.12G    0.04147  0.06319  0.01335  0.118   234      640       0.404  0.801  0.747   0.493
13/19  5.12G    0.04178  0.0565   0.01371  0.112   225      640       0.419  0.808  0.751   0.498
14/19  5.12G    0.04076  0.05859  0.01472  0.1141  179      640       0.408  0.815  0.751   0.496
15/19  5.12G    0.04175  0.05848  0.01484  0.1151  181      640       0.413  0.813  0.754   0.502
16/19  5.12G    0.04283  0.05989  0.01417  0.1169  198      640       0.415  0.82   0.754   0.503
17/19  5.12G    0.04006  0.05161  0.01465  0.1063  156      640       0.421  0.827  0.76    0.505
18/19  5.12G    0.04003  0.06271  0.01228  0.115   196      640       0.42   0.826  0.767   0.509
19/19  5.12G    0.04196  0.06346  0.01286  0.1183  221      640       0.411  0.822  0.77    0.515

Optimizer stripped from runs/exp1/weights/last.pt, 15.2MB
Optimizer stripped from runs/exp1/weights/best.pt, 15.2MB
20 epochs completed in 0.016 hours.

3.2.2 View the training curves

(pytorch) tensorboard --logdir runs
TensorFlow installation not found - running with reduced feature set.
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.3.0 at http://localhost:6006/ (Press CTRL+C to quit)

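3.3 Safety helmet detection (Ubuntu 16.04)

3.3.1 Data and preprocessing

Dataset

It includes 7581 images with 9044 human safety helmet wearing objects (positive) and 111514 normal head objects (not wearing, or negative).

1. Labels: two classes, hat and person.
2. The tricky part is splitting the dataset into training and test sets and generating the labels.
3. A few images have an upper-case .JPG extension and need to be renamed to lower-case .jpg (Ubuntu 16.04).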
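The extension fix in item 3 can be scripted. This is a minimal sketch, assuming all images sit directly in one folder (for example VOC2028/JPEGImages); `lowercase_jpg_extensions` is a hypothetical helper name.

```python
from pathlib import Path


def lowercase_jpg_extensions(image_dir):
    """Rename files whose extension is .JPG (any non-lower-case spelling) to .jpg."""
    renamed = []
    for path in sorted(Path(image_dir).iterdir()):
        if not path.is_file():
            continue
        if path.suffix != ".jpg" and path.suffix.lower() == ".jpg":
            target = path.with_suffix(".jpg")
            # Do not overwrite a different file that already uses the target name.
            if target.exists() and not target.samefile(path):
                continue
            path.rename(target)
            renamed.append(target.name)
    return renamed
```

Calling `lowercase_jpg_extensions("VOC2028/JPEGImages")` returns the names of the files it renamed, which makes it easy to confirm that only the handful of expected images changed.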

A handy script for splitting the dataset and generating the labels is gen_train_test_label.py:

"""
1. Adjust the folder paths below to match your setup.
2. Class labels are numbered from 0; the two classes here are 0 and 1.
3. The paths in this script are absolute paths on an Ubuntu 16.04 system.
"""
import os
from pathlib import Path
from shutil import copyfile
from xml.dom.minidom import parse

import numpy as np
from PIL import Image, ImageDraw

FILE_ROOT = "/home/hjz/PycharmProjects/pythonProject/"

IMAGE_SET_ROOT = FILE_ROOT + "VOC2028/ImageSets/Main"  # train/val/test split lists
IMAGE_PATH = FILE_ROOT + "VOC2028/JPEGImages"  # source images
ANNOTATIONS_PATH = FILE_ROOT + "VOC2028/Annotations"  # VOC XML annotation files
LABELS_ROOT = FILE_ROOT + "VOC2028/Labels"  # normalized label files

DEST_IMAGES_PATH = "custom_data/images"  # images split into train/val/test
DEST_LABELS_PATH = "custom_data/labels"  # labels split into train/val/test


def cord_converter(size, box):
    """
    Convert a box from a VOC XML annotation to darknet-style coordinates.
    :param size: image size [w, h]
    :param box: box corners [xmin, ymin, xmax, ymax]
    :return: normalized [x_center, y_center, w, h]
    """
    x1, y1, x2, y2 = (int(v) for v in box)

    dw = np.float32(1. / int(size[0]))
    dh = np.float32(1. / int(size[1]))

    w = x2 - x1
    h = y2 - y1
    x = x1 + (w / 2)
    y = y1 + (h / 2)
    return [x * dw, y * dh, w * dw, h * dh]


def save_file(img_file_stem, size, img_box):
    save_file_name = LABELS_ROOT + '/' + img_file_stem + '.txt'
    print(save_file_name)
    # Open with "w" rather than "a+": appending would duplicate every box
    # each time the script is run again.
    with open(save_file_name, "w") as label_file:
        for box in img_box:
            cls_num = 0 if box[0] == 'person' else 1  # two classes: person=0, hat=1
            new_box = cord_converter(size, box[1:])
            label_file.write(f"{cls_num} {new_box[0]} {new_box[1]} {new_box[2]} {new_box[3]}\n")


def test_dataset_box_feature(file_name, point_array):
    """
    Draw the annotated boxes on a sample image to check them visually.
    :param file_name: image file name
    :param point_array: boxes as [class, x1, y1, x2, y2]
    :return: None
    """
    # os.path.join instead of a hard-coded backslash, which only works on Windows
    im = Image.open(os.path.join(IMAGE_PATH, file_name))
    im_draw = ImageDraw.Draw(im)
    for box in point_array:
        im_draw.rectangle((box[1], box[2], box[3], box[4]), outline='red')
    im.show()


def get_xml_data(file_path, img_xml_file):
    img_path = file_path + '/' + img_xml_file + '.xml'
    print(img_path)

    root = parse(img_path).documentElement
    img_size = root.getElementsByTagName("size")[0]
    objects = root.getElementsByTagName("object")
    img_w = img_size.getElementsByTagName("width")[0].childNodes[0].data
    img_h = img_size.getElementsByTagName("height")[0].childNodes[0].data

    img_box = []
    for box in objects:
        cls_name = box.getElementsByTagName("name")[0].childNodes[0].data
        x1 = int(box.getElementsByTagName("xmin")[0].childNodes[0].data)
        y1 = int(box.getElementsByTagName("ymin")[0].childNodes[0].data)
        x2 = int(box.getElementsByTagName("xmax")[0].childNodes[0].data)
        y2 = int(box.getElementsByTagName("ymax")[0].childNodes[0].data)
        img_box.append([cls_name, x1, y1, x2, y2])

    # Uncomment to inspect the boxes on the image:
    # test_dataset_box_feature(img_xml_file + '.jpg', img_box)
    save_file(img_xml_file, [img_w, img_h], img_box)


def copy_data(img_set_source, img_labels_root, imgs_source, split_name):
    # Create the destination folders if they do not exist yet
    dest_images = Path(FILE_ROOT + DEST_IMAGES_PATH + '/' + split_name)
    dest_labels = Path(FILE_ROOT + DEST_LABELS_PATH + '/' + split_name)
    for folder in (dest_images, dest_labels):
        if not folder.exists():
            print(f"Path {folder} does not exist, creating it")
            os.makedirs(folder)

    # Walk through the image names listed for this split
    with open(img_set_source + '/' + split_name + ".txt") as split_file:
        for line in split_file:
            img_name = line.strip()
            if not img_name:
                continue
            print(img_name)
            # Copy the image
            copyfile(imgs_source + '/' + img_name + '.jpg', str(dest_images / (img_name + '.jpg')))
            # Copy the label
            copyfile(img_labels_root + '/' + img_name + '.txt', str(dest_labels / (img_name + '.txt')))


if __name__ == '__main__':
    # Generate the labels
    os.makedirs(LABELS_ROOT, exist_ok=True)
    for file in os.listdir(ANNOTATIONS_PATH):
        print("file name: ", file)
        get_xml_data(ANNOTATIONS_PATH, os.path.splitext(file)[0])

    # Split the files into train, val and test
    copy_data(IMAGE_SET_ROOT, LABELS_ROOT, IMAGE_PATH, "train")
    copy_data(IMAGE_SET_ROOT, LABELS_ROOT, IMAGE_PATH, "val")
    copy_data(IMAGE_SET_ROOT, LABELS_ROOT, IMAGE_PATH, "test")
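To make the coordinate conversion in the script concrete, here is a small worked example. It is a sketch of the same arithmetic in plain Python, with a hypothetical function name and made-up box values chosen so the result is easy to check by hand.

```python
def voc_to_yolo(img_w, img_h, xmin, ymin, xmax, ymax):
    """Convert VOC corner coordinates (pixels) to normalized centre/size values."""
    box_w = xmax - xmin
    box_h = ymax - ymin
    x_center = xmin + box_w / 2.0
    y_center = ymin + box_h / 2.0
    # Divide by the image size so every value lies between 0 and 1.
    return (x_center / img_w, y_center / img_h, box_w / img_w, box_h / img_h)


# A 320x240 box centred in a 640x480 image.
print(voc_to_yolo(640, 480, 160, 120, 480, 360))  # (0.5, 0.5, 0.5, 0.5)
```

Each line of a label file is then the class index followed by these four numbers, which is the format `save_file` writes.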

3.3.2 Modify the configuration files

hat.yaml:

# Custom data for safety helmet

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: /home/hjz/PycharmProjects/pythonProject/custom_data/images/train
val: /home/hjz/PycharmProjects/pythonProject/custom_data/images/val
test: /home/hjz/PycharmProjects/pythonProject/custom_data/images/test

# number of classes
nc: 2

# class names
names: ['person', 'hat']

hat_yolov5s.yaml:

# parameters
nc: 2  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple

# anchors, can be changed later
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, BottleneckCSP, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 9, BottleneckCSP, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, BottleneckCSP, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 1, SPP, [1024, [5, 9, 13]]],
   [-1, 3, BottleneckCSP, [1024, False]],  # 9
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, BottleneckCSP, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, BottleneckCSP, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, BottleneckCSP, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, BottleneckCSP, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
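The `depth_multiple` and `width_multiple` values explain why the layer table printed in 3.2.1 shows smaller numbers than the YAML. The sketch below reflects my understanding of the scaling and is not the repository's own code: it assumes repeat counts are rounded with a floor of 1 and channel counts are rounded up to a multiple of 8, and both function names are hypothetical.

```python
import math


def scale_depth(repeats, depth_multiple):
    """Scale a block repeat count; a count of 1 is left unchanged (assumed rule)."""
    return max(round(repeats * depth_multiple), 1) if repeats > 1 else repeats


def scale_width(channels, width_multiple, divisor=8):
    """Scale a channel count and round up to a multiple of `divisor` (assumed rule)."""
    return math.ceil(channels * width_multiple / divisor) * divisor


# yolov5s settings from the YAML above
print(scale_depth(9, 0.33), scale_depth(3, 0.33))    # repeats: 9 -> 3, 3 -> 1
print(scale_width(1024, 0.50), scale_width(64, 0.50))  # channels: 1024 -> 512, 64 -> 32
```

These values agree with the 3.2.1 log, where the Focus layer has 32 output channels, the deepest Conv has 512, and the BottleneckCSP blocks written as 9 repeats show 3 in their arguments.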

3.3.3 Trial training

python train.py --data=data/hat.yaml --cfg=data/hat_yolov5s.yaml --batch-size=16 --epochs=10

Analyzing anchors... anchors/target = 4.25, Best Possible Recall (BPR) = 0.9999
Image sizes 640 train, 640 test
Using 8 dataloader workers
Logging results to runs/exp14
Starting training for 10 epochs...

Training ran 342 batches per epoch. P, R and mAP come from the class "all" validation line (607 images, 2.98e+04 targets in every epoch).

Epoch  gpu_mem  box      obj      cls        total    targets  img_size  P      R      mAP@.5  mAP@.5:.95
0/9    4.51G    0.08594  0.07445  0.01321    0.1736   39       640       0.221  0.288  0.21    0.0712
1/9    4.58G    0.0641   0.067    0.004142   0.1352   9        640       0.365  0.3    0.251   0.106
2/9    4.58G    0.05703  0.06752  0.002748   0.1273   273      640       0.406  0.311  0.273   0.144
3/9    4.58G    0.04976  0.06421  0.002333   0.1163   6        640       0.616  0.307  0.304   0.16
4/9    4.58G    0.04688  0.06446  0.001753   0.1131   273      640       0.645  0.309  0.306   0.177
5/9    4.58G    0.04377  0.06128  0.001416   0.1065   30       640       0.627  0.312  0.307   0.178
6/9    4.58G    0.04228  0.0616   0.001187   0.1051   243      640       0.679  0.312  0.309   0.185
7/9    4.58G    0.04071  0.05956  0.001062   0.1013   15       640       0.675  0.312  0.309   0.188
8/9    4.58G    0.04015  0.0596   0.0008846  0.1006   48       640       0.688  0.312  0.31    0.189
9/9    4.58G    0.03959  0.0595   0.0007798  0.09986  39       640       0.695  0.313  0.312   0.192

Optimizer stripped from runs/exp14/weights/last.pt, 14.8MB
Optimizer stripped from runs/exp14/weights/best.pt, 14.8MB
10 epochs completed in 0.156 hours.

3.3.4 Test

python detect.py --weights=/home/hjz/PycharmProjects/pythonProject/01-yolov5-master/runs/exp14/weights/best.pt --source=/home/hjz/PycharmProjects/pythonProject/01-yolov5-master/inference/int/01.jpg

The result after 10 epochs (about 15 minutes of training) is shown in the figure:

3.4 Training yolov5s for 100 epochs

3.4.1 Anchor box clustering

gen_anchors_box_kmeans.py

# -*- coding: utf-8 -*-
import argparse
import os
import random

import numpy as np

# Command-line arguments
parser = argparse.ArgumentParser(description='Generate anchor boxes for YOLO-V3 with this script\n')
parser.add_argument('--input_annotation_txt_dir', required=True, type=str,
                    help='directory holding the annotation txt files (avoid non-ASCII characters in the path)')
parser.add_argument('--output_anchors_txt', required=True, type=str,
                    help='output text file that stores the anchor boxes')
parser.add_argument('--input_num_anchors', required=True, default=6, type=int,
                    help='number of clusters to compute (number of anchor boxes)')
parser.add_argument('--input_cfg_width', required=True, type=int, help='width in the config file')
parser.add_argument('--input_cfg_height', required=True, type=int, help='height in the config file')
args = parser.parse_args()


def IOU(annotation_array, centroids):
    """
    annotation_array: width and height of one annotated box
    centroids: cluster centres, ndarray of shape (num, 2)
    """
    similarities = []
    w, h = annotation_array
    for centroid in centroids:
        c_w, c_h = centroid
        if c_w >= w and c_h >= h:  # case 1: the box fits inside the centroid
            similarity = w * h / (c_w * c_h)
        elif c_w >= w and c_h <= h:  # case 2: centroid is wider but shorter
            similarity = w * c_h / (w * h + (c_w - w) * c_h)
        elif c_w <= w and c_h >= h:  # case 3: centroid is narrower but taller
            similarity = c_w * h / (w * h + (c_h - h) * c_w)
        else:  # case 4: the centroid fits inside the box
            similarity = (c_w * c_h) / (w * h)
        similarities.append(similarity)
    # Convert the list to an ndarray
    return np.array(similarities, np.float32)  # 1-D array of shape (num,)


def k_means(annotations_array, centroids, eps=0.00005, iterations=200000):
    """
    k-means clustering.
    annotations_array: widths and heights of all N annotated boxes, ndarray of shape (N, 2)
    centroids: cluster centres, ndarray of shape (num, 2), updated in place
    The loop normally ends when two consecutive iterations give the same assignments.
    """
    N = annotations_array.shape[0]
    num = centroids.shape[0]
    # Loss value and assignments from the previous iteration
    distance_sum_pre = -1
    assignments_pre = -1 * np.ones(N, dtype=np.int64)
    iteration = 0
    while True:
        iteration += 1
        distances = []
        # Distance (1 - IOU) from every annotated box to every cluster centre
        for i in range(N):
            distance = 1 - IOU(annotations_array[i], centroids)
            distances.append(distance)
        distances_array = np.array(distances, np.float32)  # shape (N, num)
        # Index of the nearest cluster centre for every box (row-wise argmin)
        assignments = np.argmin(distances_array, axis=1)
        # Sum of all distances, used here as the k-means loss
        distances_sum = np.sum(distances_array)
        # Compute the new cluster centres
        centroid_sums = np.zeros(centroids.shape, np.float32)
        for i in range(N):
            centroid_sums[assignments[i]] += annotations_array[i]  # per-cluster sum
        for j in range(num):
            count = np.sum(assignments == j)
            if count > 0:  # keep the old centre if a cluster is empty
                centroids[j] = centroid_sums[j] / count
        # Change in distance between two consecutive iterations
        diff = abs(distances_sum - distance_sum_pre)
        print("iteration: {},distance: {}, diff: {}, avg_IOU: {}\n".format(
            iteration, distances_sum, diff, np.sum(1 - distances_array) / (N * num)))
        # Three exit conditions: unchanged assignments, change below eps, iteration limit
        if (assignments == assignments_pre).all():
            print("Stopped: cluster assignments are identical in two consecutive iterations\n")
            break
        if diff < eps:
            print("Stopped: change in distance is below eps\n")
            break
        if iteration > iterations:
            print("Stopped: iteration limit reached\n")
            break
        # Remember this iteration
        distance_sum_pre = distances_sum
        assignments_pre = assignments.copy()


if __name__ == '__main__':
    # Number of clusters, i.e. number of anchor boxes
    num_clusters = args.input_num_anchors
    # Names of all annotation files (.txt) in the folder
    names = os.listdir(args.input_annotation_txt_dir)
    # Widths and heights of the annotated boxes
    annotations_w_h = []
    for name in names:
        txt_path = os.path.join(args.input_annotation_txt_dir, name)
        with open(txt_path, 'r', encoding="utf-8") as f:
            for line in f:
                parts = line.split()
                if len(parts) < 5:
                    continue  # skip blank or malformed lines
                # float() instead of eval(): same result for numbers, no code execution
                annotations_w_h.append((float(parts[3]), float(parts[4])))
    # Array of shape (N, 2), where N is the number of boxes
    annotations_array = np.array(annotations_w_h, dtype=np.float32)
    N = annotations_array.shape[0]
    # Randomly initialise the cluster centres
    random_indices = [random.randrange(N) for _ in range(num_clusters)]
    centroids = annotations_array[random_indices]
    # k-means clustering
    k_means(annotations_array, centroids, 0.00005, 200000)
    # Sort the centroids by width and write them to the output file
    widths = centroids[:, 0]
    sorted_indices = np.argsort(widths)
    anchors = centroids[sorted_indices]
    with open(args.output_anchors_txt, 'w') as f_anchors:
        for anchor in anchors:
            f_anchors.write('%d,%d' % (int(anchor[0] * args.input_cfg_width),
                                       int(anchor[1] * args.input_cfg_height)))
            f_anchors.write('\n')

Run it as follows:

python gen_anchors_kmeans.py --input_annotation_txt_dir=/home/hjz/PycharmProjects/pythonProject/VOC2028/Labels --output_anchors_txt=achors.txt --input_num_anchors=9 --input_cfg_width=640 --input_cfg_height=640

iteration: 189,distance: 2494381.0, diff: 2.75, avg_IOU: 0.23371242911610443
Stopped: cluster assignments are identical in two consecutive iterations

Resulting anchors:

8,18
12,26
19,36
30,52
45,77
68,114
96,175
153,250
287,399
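The four-branch IOU in the script compares boxes by width and height only, as if both shared the same top-left corner. Under that assumption the overlap is just the smaller width times the smaller height, so the branches collapse into one formula. This sketch uses hypothetical function names and made-up example boxes.

```python
def wh_iou(box_wh, centroid_wh):
    """IoU of two boxes aligned at a common corner, given only (width, height)."""
    w, h = box_wh
    c_w, c_h = centroid_wh
    intersection = min(w, c_w) * min(h, c_h)
    union = w * h + c_w * c_h - intersection
    return intersection / union


def nearest_centroid(box_wh, centroids):
    """Index of the centroid with the smallest distance 1 - IoU to the box."""
    distances = [1.0 - wh_iou(box_wh, c) for c in centroids]
    return distances.index(min(distances))


centroids = [(1, 1), (2, 5), (6, 6)]
print(wh_iou((2, 4), (2, 5)))             # 8 / 10 = 0.8
print(nearest_centroid((2, 4), centroids))  # 1
```

Using 1 - IoU as the distance, rather than Euclidean distance on width and height, keeps large boxes from dominating the clustering.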

Replace the anchors in our hat_yolov5s.yaml file with the clustered ones:

#anchors:
#  - [10,13, 16,30, 33,23]  # P3/8
#  - [30,61, 62,45, 59,119]  # P4/16
#  - [116,90, 156,198, 373,326]  # P5/32
anchors:
  - [8,18, 12,26, 19,36]  # P3/8
  - [30,52, 45,77, 68,114]  # P4/16
  - [96,175, 153,250, 287,399]  # P5/32
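Copying nine pairs into the YAML by hand is error-prone. Here is a minimal sketch of a formatter, assuming three anchors per detection level and the P3/P4/P5 strides 8, 16 and 32 from the comments in the config; `anchors_to_yaml` is a hypothetical helper, and it sorts by width as the clustering script does.

```python
def anchors_to_yaml(anchors, strides=(8, 16, 32)):
    """Format (w, h) anchor pairs as the 'anchors:' section of a model YAML."""
    if len(anchors) != 3 * len(strides):
        raise ValueError("expected three anchors per detection level")
    ordered = sorted(anchors, key=lambda wh: wh[0])  # smallest width first
    lines = ["anchors:"]
    for level, stride in enumerate(strides):
        group = ordered[3 * level: 3 * level + 3]
        flat = ", ".join(f"{w},{h}" for w, h in group)
        lines.append(f"  - [{flat}]  # P{level + 3}/{stride}")
    return "\n".join(lines)


clustered = [(8, 18), (12, 26), (19, 36), (30, 52), (45, 77),
             (68, 114), (96, 175), (153, 250), (287, 399)]
yaml_text = anchors_to_yaml(clustered)
print(yaml_text)
```

The printed block can be pasted over the existing `anchors:` section of hat_yolov5s.yaml.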

3.4.2 train

python train.py --data data/hat.yaml --cfg data/hat_yolov5s.yaml --weights yolov5s.pt --batch-size 32 --epochs 100

Last two epochs (171 training batches per epoch; validation on class "all", 607 images, 2.98e+04 targets):

Epoch  gpu_mem  box      obj      cls        total    targets  img_size  P      R      mAP@.5  mAP@.5:.95
98/99  5.95G    0.03468  0.0506   0.000218   0.08549  846      640       0.792  0.313  0.314   0.2
99/99  5.95G    0.03429  0.05084  0.0002206  0.08535  1512     640       0.791  0.313  0.313   0.199

Optimizer stripped from runs/exp15/weights/last.pt, 14.8MB
Optimizer stripped from runs/exp15/weights/best.pt, 14.8MB
100 epochs completed in 1.214 hours.

3.4.3 test.py

python test.py --weights=/home/hjz/PycharmProjects/pythonProject/01-yolov5-master/runs/exp15/weights/last.pt --data=data/hat.yaml

Fusing layers...
Model Summary: 140 layers, 7.24922e+06 parameters, 0 gradients
Scanning labels /home/hjz/PycharmProjects/pythonProject/custom_data/labels/val.cache (607 found, 0 missing, 0 empty, 607 duplicate, for 607 images): 607it [00:00, 11843.14it/s]

Class  Images  Targets   P      R      mAP@.5  mAP@.5:.95
all    607     2.98e+04  0.768  0.313  0.315   0.199

Speed: 1.2/1.0/2.2 ms inference/NMS/total per 640x640 image at batch-size 32

3.5 yolov5s训练600

594/599 9.41G 0.02903 0.0438 0.0001633 0.07299 2100 640: 100%|██████████████████████████████████████████| 114/114 [00:37<00:00, 3.05it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 13/13 [00:02<00:00, 4.39it/s]
all 607 2.98e+04 0.805 0.312 0.311 0.195

Epoch gpu_mem box obj cls total targets img_size
595/599 9.41G 0.02927 0.04337 0.0001394 0.07278 2184 640: 100%|██████████████████████████████████████████| 114/114 [00:37<00:00, 3.06it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 13/13 [00:02<00:00, 4.38it/s]
all 607 2.98e+04 0.804 0.312 0.311 0.195

Epoch gpu_mem box obj cls total targets img_size
596/599 9.41G 0.02892 0.04282 0.000151 0.07189 1752 640: 100%|██████████████████████████████████████████| 114/114 [00:37<00:00, 3.05it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 13/13 [00:03<00:00, 4.30it/s]
all 607 2.98e+04 0.803 0.312 0.311 0.195

Epoch gpu_mem box obj cls total targets img_size
597/599 9.41G 0.0288 0.0426 0.0001617 0.07156 2142 640: 100%|██████████████████████████████████████████| 114/114 [00:37<00:00, 3.05it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 13/13 [00:02<00:00, 4.46it/s]
all 607 2.98e+04 0.803 0.311 0.311 0.195

Epoch gpu_mem box obj cls total targets img_size
598/599 9.41G 0.02896 0.04226 0.0001563 0.07137 1836 640: 100%|██████████████████████████████████████████| 114/114 [00:37<00:00, 3.07it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 13/13 [00:03<00:00, 4.31it/s]
all 607 2.98e+04 0.803 0.311 0.31 0.195

Epoch gpu_mem box obj cls total targets img_size
599/599 9.41G 0.02936 0.04317 0.0002012 0.07273 1827 640: 100%|██████████████████████████████████████████| 114/114 [00:37<00:00, 3.05it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 13/13 [00:03<00:00, 3.40it/s]
all 607 2.98e+04 0.803 0.311 0.311 0.195

Optimizer stripped from runs/exp17/weights/last.pt, 14.8MB
Optimizer stripped from runs/exp17/weights/best.pt, 14.8MB
600 epochs completed in 6.774 hours.

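The results are rather poor: white hats are mistaken for safety helmets, and there are many missed detections.

3.6 Testing yolov5x (Ubuntu 16.04)

3.6.1 train.py

As above, create a new hat_yolov5x.yaml file and select it when training.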

python train.py --data data/hat.yaml --cfg data/hat_yolov5x.yaml --weights yolov5x.pt --batch-size 8 --epochs 300

Epoch gpu_mem box obj cls total targets img_size
298/299 8.42G 0.02804 0.04383 0.0002171 0.07209 30 640: 100%|██████████████████████████████████████████| 683/683 [04:02<00:00, 2.82it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 76/76 [00:08<00:00, 8.74it/s]
all 607 2.98e+04 0.814 0.315 0.314 0.202

Epoch gpu_mem box obj cls total targets img_size
299/299 8.42G 0.02814 0.04331 0.0001779 0.07163 30 640: 100%|██████████████████████████████████████████| 683/683 [04:02<00:00, 2.82it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 76/76 [00:08<00:00, 8.54it/s]
all 607 2.98e+04 0.815 0.315 0.314 0.202

Optimizer stripped from runs/exp5/weights/last.pt, 177.5MB
Optimizer stripped from runs/exp5/weights/best.pt, 177.5MB
300 epochs completed in 21.455 hours.

3.6.2 test.py

python test.py --weights=/home/hjz/PycharmProjects/pythonProject/01-yolov5-master/runs/exp5/weights/last.pt --data=data/hat.yaml

Fusing layers...
Model Summary: 284 layers, 8.83973e+07 parameters, 0 gradients
Scanning labels /home/hjz/PycharmProjects/pythonProject/custom_data/labels/val.cache (607 found, 0 missing, 0 empty, 607 duplicate, for 607 images): 607it [00:00, 12405.50it/s]
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|████████████████████████████████| 19/19 [00:08<00:00, 2.26it/s]
all 607 2.98e+04 0.812 0.315 0.315 0.202
Speed: 7.9/0.9/8.9 ms inference/NMS/total per 640x640 image at batch-size 32

Precision (P) is still high and recall (R) is still low.
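To put a single number on the high-P, low-R pattern, the two values can be combined into an F1 score (the harmonic mean of precision and recall). The snippet below is only a small sketch: the P and R values are copied from the "all" rows of the logs in this post, while the function name `f1_score` and the `runs` dictionary are illustrative names, not part of YOLOv5.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# P and R copied from the "all" rows of the logs in this post.
runs = {
    "yolov5s, 600 epochs (last epoch)": (0.803, 0.311),
    "yolov5x, 300 epochs (test.py)": (0.812, 0.315),
}

for name, (p, r) in runs.items():
    print(f"{name}: P={p:.3f} R={r:.3f} F1={f1_score(p, r):.3f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, a recall near 0.31 keeps F1 low even though precision is above 0.8, which matches the many missed detections seen in practice.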

3.6.3 detect.py

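As can be seen, the detections show the same high-P, low-R behavior, so missed detections are likely to be more frequent.

3.7 Retesting yolov5 (Windows 10)

3.7.1 Preparation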


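The previous results were too poor, so I retrained following a similar project, this time with three classes: person, head, and helmet.

Source code link

First, download the VOC2028 dataset; it can be placed in the project folder. Then run detect.py on the dataset to generate the person labels (class 0). Make sure the weights file is placed in the corresponding location.

Weights file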

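python detect.py --save-txt --source=E:\01_hjz\01_work\pythonProject\Smart_Construction-master\VOC2028\JPEGImages

Run gen_head_helmet.py to generate the score folder with the train/val/test split.
Create a new folder named Labels and run merge_data.py to merge the label=0 (person) annotations into the VOC2028 labels.
Then check whether the label samples under the score folder contain classes 0, 1, and 2. If they do, the data is ready.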
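The final check, that the labels only contain classes 0, 1, and 2, can be automated. The following is a minimal sketch, assuming the labels are YOLO-format .txt files in which each line starts with the class id. The helper names `count_label_classes` and `unexpected_classes` are made up for this example and are not part of the Smart_Construction project; the demo runs on a throwaway directory rather than the real score folder.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_label_classes(label_dir):
    """Count class ids over all .txt label files found under label_dir.

    Each non-empty line is assumed to look like: "<class> <x> <y> <w> <h>".
    """
    counts = Counter()
    for txt in Path(label_dir).rglob("*.txt"):
        for line in txt.read_text().splitlines():
            parts = line.split()
            if parts:  # skip blank lines
                counts[int(parts[0])] += 1
    return counts


def unexpected_classes(counts, allowed=(0, 1, 2)):
    """Return the set of class ids that are not in the allowed set."""
    return set(counts) - set(allowed)


if __name__ == "__main__":
    # Demo on a temporary directory with two tiny, made-up label files.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "a.txt").write_text("0 0.5 0.5 0.2 0.4\n2 0.5 0.3 0.1 0.1\n")
        Path(tmp, "b.txt").write_text("1 0.4 0.2 0.1 0.1\n")
        counts = count_label_classes(tmp)
        print("boxes per class id:", dict(sorted(counts.items())))
        print("unexpected class ids:", sorted(unexpected_classes(counts)))
```

To use it on the real data, pass the path of the label directory under the score folder to `count_label_classes`; an empty result from `unexpected_classes` means only person, head, and helmet ids are present.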

3.7.2 Training

1.custom_yaml

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ../Smart_Construction-master/score/images/train
val: ../Smart_Construction-master/score/images/val

# number of classes
nc: 3

# class names
names: ['person', 'head', 'helmet']
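The class id written in each label file is the position of the class in `names`, so 0 is person, 1 is head, and 2 is helmet, and `nc` has to equal the length of `names`. Below is a small sanity-check sketch. It mirrors the YAML above as a plain Python dict so that no YAML parser is needed; `check_dataset_cfg` and `class_name` are hypothetical helpers written for this example, not YOLOv5 functions.

```python
# Mirror of the dataset YAML above as a plain dict (illustrative only).
dataset_cfg = {
    "train": "../Smart_Construction-master/score/images/train",
    "val": "../Smart_Construction-master/score/images/val",
    "nc": 3,
    "names": ["person", "head", "helmet"],
}


def check_dataset_cfg(cfg):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = [f"missing key: {key}"
                for key in ("train", "val", "nc", "names") if key not in cfg]
    if not problems and cfg["nc"] != len(cfg["names"]):
        problems.append(f"nc={cfg['nc']} but {len(cfg['names'])} names given")
    return problems


def class_name(cfg, class_id):
    """Map a label-file class id to its name (the id is an index into names)."""
    return cfg["names"][class_id]


print("problems:", check_dataset_cfg(dataset_cfg))
for class_id in range(dataset_cfg["nc"]):
    print(class_id, "->", class_name(dataset_cfg, class_id))
```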

2.anchors

calculate_anchors.py

Anchors:[7.77, 15.87]
Anchors:[9.21, 20.2]
Anchors:[11.5, 23.23]
Anchors:[13.82, 28.93]
Anchors:[18.51, 35.12]
Anchors:[25.6, 44.74]
Anchors:[36.0, 61.16]
Anchors:[52.8, 89.0]
Anchors:[85.33, 147.99]
Train_Accuracy:82.27%
Ratios:[0.46, 0.48, 0.49, 0.49, 0.53, 0.57, 0.58, 0.59, 0.59]
******************** 1 ********************
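YOLOv5 model YAML files list anchors as three rows of three width,height pairs, one row per detection level (P3/8, P4/16, P5/32), from smallest to largest. The sketch below shows one way to turn the nine clustered anchors into that layout. It assumes each pair printed above is (width, height) in pixels and that the anchors are meant to replace the `anchors:` section of ./models/custom_yolov5.yaml; `group_anchors` and `format_rows` are illustrative helper names, and rounding to integers is a choice made here, not something the original script is known to do.

```python
# Anchors copied from the calculate_anchors.py output above,
# assumed to be (width, height) in pixels.
anchors = [
    (7.77, 15.87), (9.21, 20.2), (11.5, 23.23),
    (13.82, 28.93), (18.51, 35.12), (25.6, 44.74),
    (36.0, 61.16), (52.8, 89.0), (85.33, 147.99),
]


def group_anchors(anchors, per_level=3):
    """Sort anchors by area, round to integers, and split into detection levels."""
    ordered = sorted(anchors, key=lambda wh: wh[0] * wh[1])
    rounded = [(round(w), round(h)) for w, h in ordered]
    return [rounded[i:i + per_level] for i in range(0, len(rounded), per_level)]


def format_rows(groups, labels=("P3/8", "P4/16", "P5/32")):
    """Render the groups in the row style used by YOLOv5 model YAML files."""
    rows = []
    for group, label in zip(groups, labels):
        flat = ", ".join(f"{w},{h}" for w, h in group)
        rows.append(f"  - [{flat}]  # {label}")
    return "\n".join(rows)


print("anchors:")
print(format_rows(group_anchors(anchors)))
```

The smallest anchors go to the P3/8 row because that level has the finest grid and handles the smallest objects, such as distant heads and helmets.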

3.train

python train.py --img 640 --batch 32 --epochs 100 --data ./data/custom_data.yaml --cfg ./models/custom_yolov5.yaml --weights ./weights/yolov5s.pt

4.test

Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|█████████████████████████████████████████████████████| 19/19 [00:29<00:00, 1.55s/it]
all 607 1.29e+04 0.897 0.893 0.875 0.611

5.test with the expert's weights

python test.py --weights=./weights/helmet_head_person_s.pt --data=./data/custom_data.yaml

██████████| 19/19 [00:30<00:00, 1.60s/it]
all 607 1.29e+04 0.862 0.894 0.874 0.589
Speed: 1.4/1.1/2.5 ms inference/NMS/total per 640x640 image at batch-size 32

Perfect.

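6.predict

python detect.py --weights=runs/exp15/weights/best.pt --source=E:\01_hjz\01_work\pythonProject\Smart_Construction-master\inference\int\video

The results are excellent.

7.webcam detection

Passing --source=0 is enough. If the display window feels too small, add the line cv2.namedWindow(p, cv2.WINDOW_NORMAL) just before cv2.imshow(p, im0) at line 108 of detect.py, so that the window becomes resizable.

python detect.py --weights=runs/exp15/weights/last.pt --source=0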
