End-to-end object detection is a design paradigm for object detection. Starting from inputting the original image, the model directly outputs the final detection results through the forward calculation of a single network, without independent modules such as manually designed candidate region extraction and multi-stage branches. The YOLO series is a typical end-to-end detection model, which simplifies the training and inference processes.



