Obstacle detection: improved YOLOX-S based on swin transformer-tiny

ZHANG X, ZHOU M, QIU P, et al. Radar and vision fusion for real-time obstacle detection and identification[J]. Industrial robot: the international journal of robotics research and application, 2019, 46(3): 391–395.

REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL]. (2018-04-08) [2023-01-22]. https://arxiv.org/abs/1804.02767.

BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. (2020-06-05) [2023-01-22]. https://github.com/kiccho1101/paper/issues/27.

HE K, ZHANG X, REN S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE transactions on pattern analysis and machine intelligence, 2015, 37(9): 1904–1916.

LIU S, QI L, QIN H, et al. Path aggregation network for instance segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, June 18–22, 2018, Salt Lake City, USA. IEEE: New York, 2018: 8759–8768.

GE Z, LIU S, WANG F, et al. YOLOX: exceeding YOLO series in 2021[EB/OL]. (2021-08-06) [2023-01-22]. https://arxiv.org/abs/2107.08430.

JOCHER G, STOKEN A, BOROVEC J, et al. Ultralytics/YOLOv5: v5.0 - YOLOv5-P6 1280 models, AWS, Supervise.ly and YouTube integrations[J]. Zenodo, 2021, 11.

WANG C Y, LIAO H Y M, WU Y H, et al. CSPNet: a new backbone that can enhance learning capability of CNN[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, June 14–19, 2020, Seattle, WA, USA. IEEE: New York, 2020: 390–391.

VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[J]. Advances in neural information processing systems, 2017, 30.

DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: transformers for image recognition at scale[EB/OL]. (2020-10-22) [2023-01-22]. https://arxiv.org/pdf/2010.11929.pdf.

LIU Z, LIN Y, CAO Y, et al. Swin transformer: hierarchical vision transformer using shifted windows[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, October 10–17, 2021, Montreal, Canada. IEEE: New York, 2021: 10012–10022.

GRIGORESCU S, TRASNEA B, COCIAS T, et al. A survey of deep learning techniques for autonomous driving[J]. Journal of field robotics, 2020, 37(3): 362–386.

HE K, GKIOXARI G, DOLLAR P, et al. Mask R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision, October 24–27, 2017, Venice, Italy. IEEE: New York, 2017: 2961–2969.

LIN G, MILAN A, SHEN C, et al. RefineNet: multi-path refinement networks for high-resolution semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, July 21–26, 2017, Hawaii, USA. IEEE: New York, 2017: 1925–1934.

REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. Advances in neural information processing systems, 2015, 28.

LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]//European Conference on Computer Vision, September 6–12, 2014, Zurich, Switzerland. Berlin: Springer, Cham, 2014: 740–755.

JOCHER G, KWON Y, VEITCH-MICHAELIS J, et al. Ultralytics/YOLOv3: v9.5.0 - YOLOv5 v5.0 release compatibility update for YOLOv3[J]. Zenodo, 2021.

ZHANG H, CISSE M, DAUPHIN Y N, et al. Mixup: beyond empirical risk minimization[EB/OL]. (2017-10-25) [2023-01-22]. https://arxiv.org/abs/1710.09412.

GE Z, LIU S T, LI Z M, et al. OTA: optimal transport assignment for object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 20–25, 2021, virtual. IEEE: New York, 2021.

KNIGHT P A. The Sinkhorn-Knopp algorithm: convergence and applications[J]. SIAM journal on matrix analysis and applications, 2008, 30(1): 261–275.

LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]//14th European Conference on Computer Vision, October 11–14, 2016, Amsterdam, the Netherlands. Berlin: Springer International Publishing, 2016: 21–37.

LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision, October 24–27, 2017, Venice, Italy. IEEE: New York, 2017: 2980–2988.

CAI Z, VASCONCELOS N. Cascade R-CNN: high quality object detection and instance segmentation[J]. IEEE transactions on pattern analysis and machine intelligence, 2019, 43(5): 1483–1498.

LAW H, DENG J. CornerNet: detecting objects as paired keypoints[C]//Proceedings of the European Conference on Computer Vision, September 8–14, 2018, Munich, Germany. Berlin: Springer International Publishing, 2018: 734–750.

LU X, LI B, YUE Y, et al. Grid R-CNN[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 15–20, 2019, Long Beach, USA. IEEE: New York, 2019: 7363–7372.

SUN P, ZHANG R, JIANG Y, et al. Sparse R-CNN: end-to-end object detection with learnable proposals[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 20–25, 2021, virtual. IEEE: New York, 2021: 14454–14463.

OPTOELECTRONICS LETTERS
