Transformer-Based - Category - Zhaoylee's Blogs

Iter3DDet: Depth Guided Iterative Fusion and Refinement for Monocular 3D Object Detection

zhaoylee — Sat, 04 Apr 2026 12:24:33 +0800

A brief summary of the post

StreamPETR-QAF2D: Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors

zhaoylee — Sun, 15 Mar 2026 21:59:16 +0800

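🏛️ Venue: CVPR
📅 Year: 2024
💻 Code: nullmax-vision/QAF2D-CVPR 2024
📄 Paper title: Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors

This CVPR 2024 paper, "Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors" (QAF2D for short), has considerable practical engineering value. Rather than grinding away at the feature-extraction bottleneck in 3D space, it takes a clever detour through the lower-dimensional 2D domain, using mature 2D vision techniques to guide the 3D detector.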

PLOT: Pseudo-Labeling via Object Tracking for Monocular 3D Object Detection

zhaoylee — Sun, 15 Mar 2026 20:52:51 +0800

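🏛️ Venue: ICLR
📅 Year: 2026
💻 Code: None
📄 Paper title: PLOT: Pseudo-Labeling via Object Tracking for Monocular 3D Object Detection

I. Background, Research Objectives, and Core Problem

Background: Monocular 3D object detection models are extremely data-hungry. Manually annotating 3D bounding boxes, however, is expensive and time-consuming, so existing datasets with 3D labels are small, which severely limits how well the models generalize.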

Difficulty-Aware Label-Guided Denoising for Monocular 3D Object Detection

zhaoylee — Sun, 15 Mar 2026 20:52:49 +0800

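🏛️ Venue: AAAI
📅 Year: 2026
💻 Code: MonoDLGD
📄 Paper title: Difficulty-Aware Label-Guided Denoising for Monocular 3D Object Detection

I. Background, Research Objectives, and Core Problem

Background: In Transformer-based monocular 3D object detection, injecting noise into the ground-truth labels and training the model to reconstruct them (query denoising) effectively speeds up convergence and improves geometric awareness.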
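
As a rough illustration of the query-denoising idea, here is a minimal sketch of how noised queries and their reconstruction targets could be built. It is not the paper's implementation: the function names, the 2D (cx, cy, w, h) box format, the uniform noise model, and the `noise_scale` parameter are all assumptions chosen to keep the example short.

```python
import random


def make_denoising_queries(gt_boxes, noise_scale=0.4, seed=0):
    """Build noised copies of ground-truth boxes (hypothetical helper).

    Each box is (cx, cy, w, h). The centre is shifted by at most
    noise_scale * half the box size, and each side length is scaled by a
    factor in [1 - noise_scale, 1 + noise_scale].
    """
    rng = random.Random(seed)
    noised = []
    for cx, cy, w, h in gt_boxes:
        dx = rng.uniform(-1.0, 1.0) * noise_scale * w / 2
        dy = rng.uniform(-1.0, 1.0) * noise_scale * h / 2
        sw = 1.0 + rng.uniform(-1.0, 1.0) * noise_scale
        sh = 1.0 + rng.uniform(-1.0, 1.0) * noise_scale
        noised.append((cx + dx, cy + dy, w * sw, h * sh))
    # The reconstruction target of every noised query is its original box.
    return noised, list(gt_boxes)


def l1_reconstruction_loss(pred_boxes, targets):
    """Mean per-box L1 distance between predictions and clean targets."""
    total = 0.0
    for pred, target in zip(pred_boxes, targets):
        total += sum(abs(p - t) for p, t in zip(pred, target))
    return total / max(len(targets), 1)


gt = [(100.0, 50.0, 40.0, 20.0), (10.0, 10.0, 4.0, 8.0)]
noised, targets = make_denoising_queries(gt)
# Before any training, the "prediction" is just the noised box itself.
print(l1_reconstruction_loss(noised, targets))
```

During training, the noised boxes would be fed to the decoder as extra queries, and the loss would push the decoder output back toward the clean targets. Because each noised query already knows which object it belongs to, this auxiliary task needs no matching step.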

Mono3DV: Monocular 3D Object Detection with 3D-Aware Bipartite Matching and Variational Query DeNoising

zhaoylee — Sun, 15 Mar 2026 20:36:42 +0800

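🏛️ Venue: CVPR / ICCV / ECCV
📅 Year: 2026
💻 Code: None
📄 Paper title: Mono3DV: Monocular 3D Object Detection with 3D-Aware Bipartite Matching and Variational Query DeNoising

I. Background, Research Objectives, and Core Problem

Background: In recent years, Transformer-based models (DETR-style architectures in particular) have been highly successful in 2D object detection and have naturally been carried over to monocular 3D object detection (M3OD). These models rely on a query mechanism and bipartite matching to produce detections end to end, without the cumbersome non-maximum suppression (NMS) step.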
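
To make the bipartite-matching step concrete, the sketch below assigns each ground-truth object to exactly one query so that the total cost is minimal. It is only an illustration: the function name and the toy cost matrix are hypothetical, and the search is a brute-force enumeration that is practical only for tiny inputs. Real DETR-style detectors typically solve the same assignment problem with the Hungarian algorithm (for example via `scipy.optimize.linear_sum_assignment`).

```python
from itertools import permutations


def match_queries_to_targets(cost):
    """Minimum-cost one-to-one assignment by brute force (hypothetical helper).

    cost[i][j] is the cost of assigning query i to ground-truth object j.
    Assumes there are at least as many queries as objects. Returns a sorted
    list of (query_index, target_index) pairs; queries that do not appear
    in the list are treated as "no object".
    """
    n_queries = len(cost)
    n_targets = len(cost[0]) if cost else 0
    best_total = float("inf")
    best_pairs = []
    # Try every way of picking one distinct query per target.
    for query_ids in permutations(range(n_queries), n_targets):
        total = sum(cost[q][t] for t, q in enumerate(query_ids))
        if total < best_total:
            best_total = total
            best_pairs = sorted((q, t) for t, q in enumerate(query_ids))
    return best_pairs


# Three queries, two ground-truth objects (toy costs).
cost = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.5],
]
print(match_queries_to_targets(cost))  # [(0, 1), (1, 0)]
```

Because every object is matched to exactly one query, duplicate predictions are penalized during training rather than filtered afterwards, which is why these detectors can skip NMS.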

SPAN: Spatial-Projection Alignment for Monocular 3D Object Detection

zhaoylee — Sun, 15 Mar 2026 19:38:47 +0800

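🏛️ Venue: CVPR / ICCV / ECCV
📅 Year: 2026
💻 Code: GitHub link
📄 Paper title: SPAN: Spatial-Projection Alignment for Monocular 3D Object Detection

1. Background and Research Motivation

Background and Current State

Monocular 3D object detection is a core task in autonomous driving and robot vision. Its goal is to predict the 3D bounding boxes of objects from a single RGB image.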
