The headline difference
Imagery annotation works on 2D pixels. The annotator sees an image and draws geometry on it; the geometry references pixel coordinates that get back-projected to real-world coordinates via camera calibration. Standard tools (CVAT, Labelbox, Roboflow) handle this natively.
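A single pixel pins down a viewing ray, not a point; recovering real-world coordinates needs depth from somewhere (a ground-plane assumption, stereo, or a DEM). A minimal sketch of that back-projection, with made-up intrinsics and a flat-ground assumption:

```python
# Minimal back-projection sketch with a pinhole model. K and the
# flat-ground assumption are illustrative, not from any real rig.
import numpy as np

K = np.array([[1000.0,    0.0, 960.0],   # fx,  0, cx
              [   0.0, 1000.0, 540.0],   #  0, fy, cy
              [   0.0,    0.0,   1.0]])

def pixel_to_ray(u, v):
    """Unit viewing ray in the camera frame (x right, y down, z forward)."""
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    return ray / np.linalg.norm(ray)

# A pixel only determines a ray; depth comes from the ground plane here.
ray = pixel_to_ray(1200.0, 800.0)        # a pixel below the horizon
camera_height = 2.5                      # meters above flat ground (assumed)
t = camera_height / ray[1]               # intersect the ray with the ground
ground_point = t * ray                   # XYZ in the camera frame
```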
LiDAR annotation works on 3D point clouds. The annotator sees a cloud of points in space (usually rendered from multiple angles) and draws geometry that exists in 3D; the geometry references real-world XYZ coordinates directly. Tooling is more specialized; the workflow is slower; the data volume per frame is much larger.
Most projects we run are imagery-based. The LiDAR projects we do tend to be specialist work — autonomous vehicle ground truth, infrastructure surveys with mobile mapping rigs, drone photogrammetry validation. This article covers the practical differences so you can decide which fits your use case.
What changes in the workflow
Visualization is harder
Imagery annotation: open a frame, see what's in it, draw boxes. The frame is small enough to render fast and intuitive enough to understand at a glance.
LiDAR annotation: open a point cloud (millions of points per scene), figure out what you're looking at, navigate in 3D, then draw geometry. The mental load of orienting yourself in a point cloud is substantially higher than a flat image. Annotators usually use multiple views simultaneously — a 3D perspective, a top-down ortho view, side profile views — to understand a single feature.
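A rough sketch of that multi-view setup, using matplotlib and a synthetic cloud as a stand-in for real scan data:

```python
# Sketch of the three views annotators lean on: perspective, top-down
# ortho, and side profile of the same (synthetic) point cloud.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
pts = rng.normal(size=(5000, 3)) * [10.0, 10.0, 2.0]  # stand-in cloud

fig = plt.figure(figsize=(12, 4))

ax3d = fig.add_subplot(1, 3, 1, projection="3d")
ax3d.scatter(pts[:, 0], pts[:, 1], pts[:, 2], s=1)
ax3d.set_title("3D perspective")

ax_top = fig.add_subplot(1, 3, 2)
ax_top.scatter(pts[:, 0], pts[:, 1], s=1)      # drop z: top-down ortho
ax_top.set_title("Top-down ortho")

ax_side = fig.add_subplot(1, 3, 3)
ax_side.scatter(pts[:, 0], pts[:, 2], s=1)     # drop y: side profile
ax_side.set_title("Side profile")

plt.show()
```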
Annotator throughput on LiDAR is typically 30-60% of imagery throughput at equivalent feature density. Costs reflect this.
Geometry is 3D
An imagery bounding box has four corners. A LiDAR bounding box has eight (a 3D cuboid); features like lane lines become polylines in 3D space.
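Under the hood, the cuboid is usually stored compactly (center, dimensions, yaw) and expanded to its eight corners on demand. A sketch of that expansion, assuming a seven-parameter box:

```python
# Sketch of a seven-parameter 3D box (center, dimensions, yaw) expanded
# to its eight corners; a common cuboid parameterization in AV datasets.
import numpy as np

def box_corners(cx, cy, cz, length, width, height, yaw):
    """Return an (8, 3) array of corner XYZ for a yaw-rotated cuboid."""
    # Half-extents along the box's own axes, one sign pattern per corner
    x = np.array([ 1,  1, -1, -1,  1,  1, -1, -1]) * length / 2
    y = np.array([ 1, -1, -1,  1,  1, -1, -1,  1]) * width / 2
    z = np.array([-1, -1, -1, -1,  1,  1,  1,  1]) * height / 2
    # Rotate about the vertical axis, then translate to the center
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
    return (R @ np.vstack([x, y, z])).T + [cx, cy, cz]

corners = box_corners(10.0, 5.0, 1.0, 4.5, 1.8, 1.6, np.pi / 6)
```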
For some asset types, the 3D geometry is the whole point: a utility pole isn't just a 2D point on the ground, it's a vertical line from base to top with attached equipment at known heights. For others, it's incidental: a stop sign needs a ground position and a mounting height, not a full 3D outline.
Tooling needs to support 3D primitives natively. CVAT has 3D support; we extend it for our LiDAR projects.
Coordinate system handling is stricter
Imagery annotation can get away with pixel coordinates and only project to world coordinates at delivery. LiDAR annotation works in world coordinates throughout — the points come in with XYZ values relative to some reference frame.
This means the project needs to lock the coordinate reference frame up front (UTM zone, datum, units), and the annotator's view needs to render those coordinates correctly. Mistakes here are silent and corrosive — annotations placed in the wrong CRS look fine in their own coordinate system but don't align with anything else.
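One way to make that lock explicit is to pin the transform to EPSG codes at ingest. A minimal sketch with pyproj, using UTM zone 14N purely as an example:

```python
# Sketch of pinning the CRS at ingest with pyproj. EPSG:32614
# (UTM zone 14N, WGS84) is an example zone, not a recommendation.
from pyproj import Transformer

# always_xy=True forces (lon, lat) -> (easting, northing) axis order;
# leaving axis order to CRS defaults is its own silent-failure trap.
to_utm = Transformer.from_crs("EPSG:4326", "EPSG:32614", always_xy=True)

easting, northing = to_utm.transform(-97.74, 30.27)  # lon, lat in degrees
```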
Classification often requires reflectance
LiDAR points carry a reflectance value (how strongly the pulse came back), typically stored as intensity in LAS files. Different surfaces have different reflectance signatures: pavement is low, road striping is high, vegetation is variable, metal is high.
Many LiDAR classification workflows use reflectance as a primary signal alongside geometry. Imagery doesn't have a direct analog — color and texture serve a similar role but less reliably. Annotation tooling for LiDAR needs to expose reflectance as a visualization channel; without it, lane line annotation in particular is much harder.
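A sketch of pulling high-reflectance candidates (lane striping, say) out of a LAS/LAZ file with laspy; the percentile threshold is illustrative, and real values depend on the sensor and its calibration:

```python
# Sketch of reflectance-driven filtering with laspy. The filename and
# threshold are made up; reading .laz needs a backend such as lazrs
# (pip install "laspy[lazrs]").
import laspy
import numpy as np

las = laspy.read("scene.laz")
intensity = np.asarray(las.intensity)

# Striping returns hot relative to the surrounding asphalt
threshold = np.percentile(intensity, 95)
bright = intensity > threshold

stripe_candidates = np.column_stack([
    np.asarray(las.x)[bright],
    np.asarray(las.y)[bright],
    np.asarray(las.z)[bright],
])
```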
What's similar
Several important things carry over.
Schema discipline is the same
Schema lock, edge case documentation, class definitions, attribute templates — same discipline applies. The fact that you're working in 3D doesn't reduce the importance of agreeing what each class is before annotation starts.
QA pass is the same shape
Per-class accuracy targets, peer review, senior spatial QA against authoritative layers. Same three-pass model. The authoritative layer cross-reference is sometimes easier with LiDAR (you can validate against survey-grade control directly) and sometimes harder (3D ground truth is rarer than 2D).
Deliverable formats overlap
GeoJSON works for both (2D features extracted from LiDAR can ship as GeoJSON; 3D features ship with elevation in z-coordinate). LAS/LAZ is LiDAR-specific. KITTI works for both with subtle format differences. We deliver in whatever your pipeline ingests.
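For the GeoJSON case, the spec's optional third coordinate carries elevation, so 3D features ship without a format change. A sketch of a 3D feature (the class and attribute values are made up):

```python
# Sketch of a 3D feature as GeoJSON: per RFC 7946, a position's optional
# third element is elevation. Property names here are illustrative.
import json

pole = {
    "type": "Feature",
    "geometry": {
        "type": "LineString",            # base-to-top line of a pole
        "coordinates": [
            [-97.7431, 30.2672, 165.2],  # lon, lat, elevation (m)
            [-97.7431, 30.2672, 175.9],
        ],
    },
    "properties": {"class": "utility_pole", "height_m": 10.7},
}
print(json.dumps(pole, indent=2))
```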
When LiDAR is worth it
Three scenarios where LiDAR pays off over imagery.
You need vertical precision. LiDAR vertical accuracy from mobile mapping rigs is typically 5-15 cm; from drone surveys with ground control, 10-30 cm; from aerial LiDAR, 15-50 cm. Imagery-derived elevations are an order of magnitude worse unless you've done dense photogrammetry with surveyed control.
Your features are partly occluded. A 360° LiDAR sees behind vehicles and around objects that block a forward-facing camera. Asset inventories from LiDAR have higher completeness on dense urban networks than equivalent imagery captures.
You're training a 3D model. Autonomous vehicle perception, drone obstacle avoidance, infrastructure-clash detection. These models consume 3D data natively; converting from imagery loses information.
When imagery is fine
More scenarios than people realize.
2D asset inventories. State DOT sign inventories, billboard catalogs, urban tree surveys. The vertical dimension isn't load-bearing for the use case; imagery captures the visible features fine.
Quick coverage of a wide area. Aerial imagery covers thousands of square miles per flight; aerial LiDAR captures less area per flight and costs more per square mile. For surveys where high vertical accuracy isn't required, imagery wins on cost per unit area.
Training perception models that take RGB input. Almost all camera-based AV perception, most drone-based monitoring, most CCTV-derived analytics. The model takes images, the training data should be images.
Cost comparison, ballpark
Per-feature annotation costs, roughly:
- Generic image labeling: $0.05-$0.30/feature
- GIS-native image labeling: $0.40-$2.50/feature
- LiDAR point cloud labeling: $1.50-$8.00/feature
The LiDAR premium is the workflow cost (slower throughput) plus the tooling cost (specialized software, more compute to render). Per-feature costs in all three tiers vary widely with density (sparse features cost more per unit because of setup overhead), schema complexity, and accuracy requirements.
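To ballpark a whole project from those bands (the feature count here is made up):

```python
# Back-of-envelope cost bands from the per-feature ranges above.
rates = {
    "generic_imagery":    (0.05, 0.30),
    "gis_native_imagery": (0.40, 2.50),
    "lidar":              (1.50, 8.00),
}
features = 25_000  # hypothetical project size

for workflow, (low, high) in rates.items():
    print(f"{workflow}: ${features * low:,.0f} - ${features * high:,.0f}")
```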
Hybrid: LiDAR + imagery together
More and more of our projects are hybrid: the capture rig collected both LiDAR and synchronized RGB imagery, and we annotate using both sources together. The combined workflow is more powerful than either alone (see the projection sketch after this list):
- LiDAR provides 3D geometry and reflectance
- Imagery provides color, texture, and human-readable detail
- The annotator can flip between views without losing context
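The link between the two views is the rig calibration: project cloud points into the synchronized frame through the extrinsics and intrinsics. A sketch, where T_cam_lidar and K stand in for real calibration values:

```python
# Sketch of the LiDAR -> image link in a hybrid workflow. T_cam_lidar
# and K are placeholders for an actual rig's calibration.
import numpy as np

K = np.array([[1000.0,    0.0, 960.0],
              [   0.0, 1000.0, 540.0],
              [   0.0,    0.0,   1.0]])
T_cam_lidar = np.eye(4)                      # placeholder extrinsics

def project_to_image(points_lidar):
    """Project Nx3 LiDAR points to Nx2 pixel coordinates."""
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (T_cam_lidar @ pts_h.T)[:3]    # into the camera frame
    pts_cam = pts_cam[:, pts_cam[2] > 0]     # keep points in front of camera
    proj = K @ pts_cam
    return (proj[:2] / proj[2]).T            # perspective divide -> (u, v)
```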
Hybrid annotation costs a bit more than pure LiDAR (extra views to manage) but delivers meaningfully higher quality than either source alone on infrastructure projects. If your capture rig collected both, ask the vendor to use both.
Got LiDAR data and not sure what to do with it? Send a representative cloud (LAS, LAZ, or a sample frame) and a description of what you want labeled. We'll come back with workflow recommendations and per-feature pricing.