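ultralytics is a Python library created for advanced computer vision tasks, developed mainly around the YOLO family of models. YOLO is a very popular real-time object detection algorithm that balances speed and accuracy, and Ultralytics YOLOv8 is a cutting-edge version in the YOLO series that introduces new features and improvements to further boost performance and flexibility. Today we will use the ultralytics and opencv-python libraries to implement a simple object detection task.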
Environment setup:
ultralytics
opencv-python
These are the libraries that need to be installed; Python built-in libraries are not listed.
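Before going further, it can help to confirm that both libraries are actually installed. The following is a minimal sketch that uses only the standard library; the helper name installed_versions is our own and is not part of either package.

```python
from importlib import metadata

# Distribution names as published on PyPI (the two libraries listed above).
REQUIRED = ("ultralytics", "opencv-python")


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        status = version if version else f"not installed (try: pip install {name})"
        print(f"{name}: {status}")
```

If a package shows up as missing, installing it with pip under the name shown should be enough in most setups.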
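Of course, these libraries alone are not enough. Although we use the YOLO class that ultralytics already packages for us, we still need to load a pretrained weights file, which is available here: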
https://github.com/ultralytics/ultralytics?tab=readme-ov-file
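On that page, find the section that lists the pretrained models.
I use yolov8n, but you can use any of the others.
Note: clicking the blue underlined text downloads the corresponding file. The suffixes n, s, m, l, and x indicate the model size. The models differ in performance, but all of them can be used for object detection. Larger models are generally more accurate, but inference may also take longer.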
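To make switching between model sizes less error-prone, the weights filename can be built from the scale letter. This is only a sketch: the helper weights_filename is hypothetical, and it assumes the other files follow the same naming pattern as yolov8n.pt.

```python
# Scale suffixes named in the note above, ordered from smallest to largest.
SCALES = ("n", "s", "m", "l", "x")


def weights_filename(scale="n"):
    """Return the YOLOv8 detection weights filename for a given scale letter."""
    if scale not in SCALES:
        raise ValueError(f"unknown scale {scale!r}; expected one of {SCALES}")
    # Assumption: every scale uses the same pattern as "yolov8n.pt".
    return f"yolov8{scale}.pt"
```

With this in place, the model could be loaded as YOLO(weights_filename("s")) instead of hard-coding the file name.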
The code is as follows:
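from ultralytics import YOLO
import cv2
import math
import time

# Choose one input source; a second assignment would overwrite the first.
# cap = cv2.VideoCapture(1)  # For webcam (use the index of your camera)
cap = cv2.VideoCapture("bikes.mp4")  # For video file

model = YOLO("yolov8n.pt")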
classNames = ["person", "bicycle", "car", "motorbike", "aeroplane", "bus", "train", "truck", "boat",
              "traffic light", "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat",
              "dog", "horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella",
              "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball", "kite", "baseball bat",
              "baseball glove", "skateboard", "surfboard", "tennis racket", "bottle", "wine glass", "cup",
              "fork", "knife", "spoon", "bowl", "banana", "apple", "sandwich", "orange", "broccoli",
              "carrot", "hot dog", "pizza", "donut", "cake", "chair", "sofa", "pottedplant", "bed",
              "diningtable", "toilet", "tvmonitor", "laptop", "mouse", "remote", "keyboard", "cell phone",
              "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors",
              "teddy bear", "hair drier", "toothbrush"
              ]
prev_frame_time = 0
new_frame_time = 0

while True:
    new_frame_time = time.time()
    success, img = cap.read()
    if not success:  # End of the video or a failed read
        break
    results = model(img, stream=True)
    for r in results:
        boxes = r.boxes
        for box in boxes:
            # Bounding box
            x1, y1, x2, y2 = box.xyxy[0]
            x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
            cv2.rectangle(img, (x1, y1), (x2, y2), (255, 0, 255), 3)
            # Confidence, rounded up to two decimals
            conf = math.ceil(float(box.conf[0]) * 100) / 100
            # Class index
            cls = int(box.cls[0])
            # Keep the label inside the frame when the box touches the top or left edge
            cv2.putText(img, f'{classNames[cls]} {conf}', (max(0, x1), max(35, y1)),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.75, (255, 0, 255), 2)
    fps = 1 / (new_frame_time - prev_frame_time)
    prev_frame_time = new_frame_time
    print(fps)
    cv2.imshow("Image", img)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # Press q to quit
        break

cap.release()
cv2.destroyAllWindows()
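The logic is simple: load the YOLO model, detect the objects in each video frame, draw a box around every detected object, and label it with its class. Once we have the object information, the functions in opencv-python make it easy to draw the boxes and class labels.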
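The per-box arithmetic in the loop can also be pulled out into small pure-Python helpers, which makes it easier to check without a camera or a model. This is a sketch only; the function names to_pixel_box, label_text, and label_origin are our own and do not come from ultralytics or OpenCV.

```python
import math


def to_pixel_box(xyxy):
    """Convert four float-like coordinates (x1, y1, x2, y2) to integers."""
    x1, y1, x2, y2 = (int(v) for v in xyxy)
    return x1, y1, x2, y2


def label_text(class_name, confidence):
    """Build the caption, rounding confidence up to two decimals as the script does."""
    conf = math.ceil(confidence * 100) / 100
    return f"{class_name} {conf:.2f}"


def label_origin(x1, y1, min_y=35):
    """Keep the caption inside the frame when a box touches the top or left edge."""
    return max(0, x1), max(min_y, y1)


if __name__ == "__main__":
    box = to_pixel_box((10.7, 20.2, 110.9, 220.5))
    print(box, label_text("person", 0.873), label_origin(box[0], box[1]))
```

In the detection loop, the tuple from to_pixel_box would feed cv2.rectangle, and label_text with label_origin would supply the text and position for cv2.putText.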
The video below shows the result:
As you can see, the results are quite good.
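Here, by using the packaged YOLO, we implemented object detection with very little effort while still keeping a good frame rate. In fact, we can implement other features as well; these will be covered one by one in future articles.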
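The script prints the instantaneous frame rate, which jumps around from frame to frame. If a steadier number is wanted, one option is to average over the last few frames. The class below is a hypothetical sketch of that idea, not something provided by ultralytics or OpenCV; the window size of 30 is an arbitrary choice.

```python
from collections import deque


class FpsMeter:
    """Average frame rate over the most recent `window` frame intervals."""

    def __init__(self, window=30):
        # Keep window + 1 timestamps so that `window` intervals are covered.
        self.timestamps = deque(maxlen=window + 1)

    def tick(self, now):
        """Record a frame timestamp (in seconds) and return the smoothed FPS."""
        self.timestamps.append(now)
        if len(self.timestamps) < 2:
            return 0.0  # Not enough data yet
        elapsed = self.timestamps[-1] - self.timestamps[0]
        if elapsed <= 0:
            return 0.0  # Guard against division by zero
        return (len(self.timestamps) - 1) / elapsed
```

In the main loop, one meter would be created before the loop, and meter.tick(time.time()) would replace the manual fps calculation.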