Visual positioning is a technique that uses one or more pictures, together with surrounding geolocated data, to determine the location from which the picture was taken.
I found at least 2 open research projects on the topic that could be used with Panoramax and OpenStreetMap data. They could give more accurate positioning than GPS alone, and therefore a more accurate geolocation for pictures, especially in urban areas where GPS signals can be affected by buildings.
OrienterNet (2023) :
OSMloc (2024) :
MaplocNet
I haven’t looked into the details, but the overall process is roughly the following:
- depth estimation on the picture, with optional semantic segmentation to determine the type of objects
- get surrounding OSM vector data, and build a virtual 3D twin
- match both to find the camera location and heading
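
To make the last step more concrete, here is a toy sketch of the matching idea. This is not the code of OrienterNet or OSMloc, just an illustration under strong assumptions: the map is a small 2D grid of class labels (a stand-in for rasterized OSM data), the picture has already been turned into a small top-down patch of the same labels, and only 4 headings are tried. The function name `locate` and the grid layout are made up for this example.

```python
import numpy as np


def locate(map_grid, observed, headings=(0, 90, 180, 270)):
    """Brute-force search for the position and heading where `observed`
    best overlaps `map_grid`. Returns (score, row, col, heading)."""
    best = None
    for heading in headings:
        # Rotate the observation counter-clockwise by `heading` degrees.
        rotated = np.rot90(observed, k=heading // 90)
        rh, rw = rotated.shape
        for r in range(map_grid.shape[0] - rh + 1):
            for c in range(map_grid.shape[1] - rw + 1):
                window = map_grid[r:r + rh, c:c + rw]
                # Score = number of cells whose class label agrees.
                score = int(np.sum(window == rotated))
                if best is None or score > best[0]:
                    best = (score, r, c, heading)
    return best


# Toy data: a random 20x20 "map" with 4 classes (e.g. road, building, ...).
rng = np.random.default_rng(0)
map_grid = rng.integers(0, 4, size=(20, 20))

# Simulate a camera view: cut a 5x5 patch and rotate it clockwise by 90 degrees.
true_patch = map_grid[7:12, 3:8]
observed = np.rot90(true_patch, k=-1)

score, row, col, heading = locate(map_grid, observed)
print(score, row, col, heading)
```

The search tries every position and heading and keeps the best score. The real models presumably compare learned features rather than raw labels and work at a much finer heading resolution, but the output has the same shape: a position on the map plus a heading.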
Here is OrienterNet:
and OSMloc:
One (or two) more things to test!
I’ll start with OSMloc, which seems to provide better results.

