Rigid AprilTag Bundle Tracking

Accurate AprilGroup Tracking: pose estimation + optical flow + calibration workflows
A research-heavy robotics project focused on improving AprilTag pose tracking accuracy beyond “plain solvePnP”, then using those poses for real-time calibration (including pen-tip calibration).
Instead of treating pose estimation as a one-shot calculation, this project explores a more robust pipeline: better initial guesses, tracking between frames, outlier rejection, and tighter calibration constraints, with careful attention to real-world noise (lighting, motion blur, occlusions, and camera quirks).
- What this project is
- Why it mattered
- Tech + environment
- High-level pipeline
- Milestones (what I actually built)
- What went wrong (and how I handled it)
- What I learned
What this project is
This system tracks AprilTags arranged as an “AprilGroup” (a rigid object with multiple tags), estimates the object pose in real time, and optionally uses those poses to calibrate a pen tip. The goal is stable, accurate pose estimation that holds up under motion and imperfect conditions.
Why it mattered
Typical marker pipelines can be fragile:
- jittery pose estimates
- sudden flips when tags are partially visible
- drift when the camera moves quickly
- outliers poisoning calibration datasets
So the work here was about building a pipeline that’s not just mathematically correct, but operationally reliable.
Tech + environment
Languages: Python
Core libs: OpenCV, AprilTag, NumPy (plus the repo requirements)
Platform: Ubuntu (tested)
Hardware used: USB camera, chessboard for calibration, AprilTags, calibrated dodecahedron “AprilGroup” (rigid multi-tag target)
High-level pipeline
- Camera calibration
  - Capture chessboard images
  - Compute intrinsics and store them for reuse
- Tag detection
  - Detect multiple tags per frame
- Pose estimation (baseline)
  - solvePnP with detected corners
- Pose improvement / stabilization
  - Better initial guesses (enhanced APE)
  - Optical flow between frames for more tracked points
  - Outlier rejection (OpenCV method or velocity-vector method)
- Calibration workflows
  - Pen-tip calibration using collected pose data under constraints
Milestones (what I actually built)
1) “Make it run cleanly” (build + reproducibility)
This project has a real-world setup burden: multiple native libs + Python deps.
What I did:
- documented the system dependencies clearly (Ubuntu, OpenCV, AprilTag, requirements.txt)
- created a consistent install path + folder structure
- added CLI flags so functionality can be enabled/disabled without editing code
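A minimal sketch of that flag pattern, assuming hypothetical flag names (the repo's actual CLI options may differ):

```python
import argparse

# Flag names below are illustrative, not the repo's exact CLI.
parser = argparse.ArgumentParser(description="AprilGroup tracking")
parser.add_argument("--enhanced-ape", action="store_true",
                    help="seed solvePnP with the previous frame's pose")
parser.add_argument("--optical-flow", action="store_true",
                    help="track points between frames with pyramidal LK")
parser.add_argument("--outlier-method", choices=["opencv", "velocity"],
                    default="opencv", help="outlier rejection strategy")
args = parser.parse_args()
```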
What I learned:
- reproducibility is a feature. Your research code becomes “real” when someone else can run it.
2) Camera calibration + data paths
Calibration is the foundation. Bad intrinsics = everything downstream lies.
What I did:
- automated the “use existing intrinsics if found, otherwise calibrate” flow (sketched after this list)
- enforced a simple workflow: drop chessboard images into a folder, run the main script
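A minimal sketch of that load-or-calibrate flow, assuming hypothetical file names and a 9×6 board (the repo's actual paths and board size may differ):

```python
import glob
import os

import cv2
import numpy as np

# Hypothetical names; the repo's actual paths and board size may differ.
INTRINSICS_FILE = "camera_params.npz"
CHESSBOARD_DIR = "calibration_images"
BOARD_SIZE = (9, 6)  # inner corners per chessboard row/column

def load_or_calibrate():
    """Reuse stored intrinsics if present; otherwise calibrate from chessboard images."""
    if os.path.exists(INTRINSICS_FILE):
        data = np.load(INTRINSICS_FILE)
        return data["camera_matrix"], data["dist_coeffs"]

    # One planar grid of 3D object points, reused for every view of the board.
    objp = np.zeros((BOARD_SIZE[0] * BOARD_SIZE[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:BOARD_SIZE[0], 0:BOARD_SIZE[1]].T.reshape(-1, 2)

    obj_points, img_points, img_size = [], [], None
    for path in glob.glob(os.path.join(CHESSBOARD_DIR, "*.jpg")):
        gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
        found, corners = cv2.findChessboardCorners(gray, BOARD_SIZE, None)
        if not found:
            continue  # skip blurry or badly lit frames instead of poisoning the fit
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001))
        obj_points.append(objp)
        img_points.append(corners)
        img_size = gray.shape[::-1]

    _, camera_matrix, dist_coeffs, _, _ = cv2.calibrateCamera(
        obj_points, img_points, img_size, None, None)
    np.savez(INTRINSICS_FILE, camera_matrix=camera_matrix, dist_coeffs=dist_coeffs)
    return camera_matrix, dist_coeffs
```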
How I adapted:
- treated calibration as a first-class step rather than an afterthought
- wrote notes on lighting/motion blur impact to reduce garbage inputs
3) Multi-tag pose estimation (AprilGroup concept)
A single tag can fail. Multiple tags give robustness.
What I did:
- built support for using multiple detected tag IDs per frame
- structured the project so the “group” geometry can be provided as JSON
Constraint handled:
- the calibrated dodecahedron geometry JSON is omitted for confidentiality, so the code still needed to be understandable without it (the sketch below assumes a hypothetical layout).
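Since the real geometry file is private, here is a sketch assuming a hypothetical JSON layout that maps each tag ID to its four 3D corner positions on the rigid body (detections here follow the `apriltag` package's `tag_id`/`corners` fields):

```python
import json

import cv2
import numpy as np

# Hypothetical JSON layout: {"0": [[x, y, z] x 4 corners], "1": ...}.
# The real calibrated dodecahedron geometry is not published.
with open("aprilgroup_geometry.json") as f:
    TAG_CORNERS_3D = {int(k): np.array(v, np.float32) for k, v in json.load(f).items()}

def estimate_group_pose(detections, camera_matrix, dist_coeffs, rvec0=None, tvec0=None):
    """Stack corners from every detected group tag and solve one PnP problem."""
    obj_pts, img_pts = [], []
    for det in detections:  # e.g. results from the apriltag detector
        if det.tag_id in TAG_CORNERS_3D:
            obj_pts.append(TAG_CORNERS_3D[det.tag_id])
            img_pts.append(det.corners.astype(np.float32))
    if not obj_pts:
        return None

    # An initial guess (e.g. last frame's pose) stabilizes the iterative solve.
    use_guess = rvec0 is not None and tvec0 is not None
    ok, rvec, tvec = cv2.solvePnP(
        np.concatenate(obj_pts), np.concatenate(img_pts),
        camera_matrix, dist_coeffs,
        rvec=rvec0, tvec=tvec0, useExtrinsicGuess=use_guess,
        flags=cv2.SOLVEPNP_ITERATIVE)
    return (rvec, tvec) if ok else None
```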
4) Optical flow tracking + outlier rejection
Optical flow increases point density and improves continuity between frames, but it also adds new failure modes.
What I did:
- integrated Lucas-Kanade pyramidal optical flow (toggleable via flag)
- supported different outlier rejection strategies:
- OpenCV outlier method
- velocity-vector method (based on referenced research approach)
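A condensed sketch of the LK step with a simple velocity-vector gate; the threshold rule here is illustrative, not the exact rule from the referenced research:

```python
import cv2
import numpy as np

LK_PARAMS = dict(winSize=(21, 21), maxLevel=3,
                 criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

def track_points(prev_gray, gray, prev_pts):
    """One pyramidal Lucas-Kanade step with a velocity-vector outlier gate.

    prev_pts: float32 array of shape (N, 1, 2), e.g. tag corners from the
    previous frame.
    """
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(
        prev_gray, gray, prev_pts, None, **LK_PARAMS)
    good = status.ravel() == 1
    prev_ok, next_ok = prev_pts[good], next_pts[good]
    if len(prev_ok) == 0:
        return prev_ok, next_ok

    # Reject tracks whose displacement strays far from the median motion.
    v = next_ok.reshape(-1, 2) - prev_ok.reshape(-1, 2)
    dist = np.linalg.norm(v - np.median(v, axis=0), axis=1)
    inliers = dist < 3.0 * (np.median(dist) + 1e-6)
    return prev_ok[inliers], next_ok[inliers]
```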
What I learned:
- when you add a stabilizer (optical flow), you also add new types of wrong.
- outlier detection isn’t optional. It’s the seatbelt.
5) Pen-tip calibration workflows
Once pose is stable, calibration becomes usable.
What I did:
- implemented pen-tip calibration routines (Algebraic One / Two Step; the one-step form is sketched after this list)
- added strict filtering guidance:
- mean reprojection error < 1 pixel
- only accept frames when ≥ 3 tags are detected
- disable optical flow for tighter constraints (when needed)
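For reference, the one-step algebraic form is a small least-squares problem. This sketch assumes the standard pivot-calibration setup (tip held fixed on a point while the rigid body rotates) and may differ from the repo's exact routine:

```python
import numpy as np

def pen_tip_one_step(rotations, translations):
    """Algebraic one-step pivot calibration (standard least-squares form).

    While the pen tip rests on a fixed pivot point, every accepted pose
    satisfies R_i @ p_tip + t_i = p_pivot. Stacking the constraints as
    [R_i | -I] [p_tip; p_pivot] = -t_i gives one linear system.
    """
    A = np.vstack([np.hstack([R, -np.eye(3)]) for R in rotations])
    b = -np.concatenate([np.asarray(t).reshape(3) for t in translations])
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x[:3], x[3:]  # pen tip in body frame, pivot point in camera frame
```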
What I learned:
- good calibration is about data quality gates, not just math.
What went wrong (and how I handled it)
Motion blur + lighting variability
Symptoms:
- pose jitter, inconsistent tag detection, occasional pose flips
Fixes / mitigation:
- added tighter data capture guidance
- constrained calibration dataset acceptance
- emphasized good capture conditions and filtering
Library linking + environment complexity
Symptoms:
- OpenCV / apriltag imports failing depending on environment paths
Fixes:
- explicit linking steps documented for venv usage
- made setup notes pragmatic (“these were my paths, yours may differ”)
- kept requirements in requirements.txt and the installation steps grouped together
“It works, but it’s not stable”
Symptoms:
- baseline solvePnP works but is noisy frame-to-frame
Fixes:
- improved the initial guess approach (enhanced APE option)
- optical flow for continuity
- outlier rejection strategies selectable via CLI
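Putting those pieces together, a hedged sketch of the frame loop (reusing estimate_group_pose from the multi-tag sketch above; K and dist are the intrinsics from the calibration step):

```python
import apriltag
import cv2

cap = cv2.VideoCapture(0)
detector = apriltag.Detector()
rvec, tvec = None, None  # no guess available on the first frame

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    detections = detector.detect(gray)
    result = estimate_group_pose(detections, K, dist, rvec, tvec)
    if result is not None:
        rvec, tvec = result  # enhanced-APE idea: seed the next frame's solve
```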
What I learned
- Applied computer vision pragmatically: calibration, pose estimation, optical flow, outlier filtering
- Built “research code” like production code: flags, logging, clear setup steps, reproducible run path
- Made tradeoffs intentionally: stability vs sensitivity, tight constraints vs “more data”
- Debugged real-world CV issues: motion blur, lighting, drift, and noisy detections
- Designed for future extension: modular steps, configurable methods, swappable outlier strategies