More and more local (OSM) communities are considering deploying a Panoramax instance.
Here is the important information to consider before moving forward, covering the legal, technical and investment (human and financial) aspects.
Legal
Each country can have its own laws regarding privacy, data protection, photography in public spaces, etc.
Before starting the project in France, we did a legal study to make sure we were not putting ourselves at risk and to know what can and can’t be done.
For example:
- taking pictures in the public space is ok,
- making them public can only be done after blurring faces and license plates,
- people present in the public space, and intellectual property rights on buildings or artworks located there, are not an issue as long as the subject of the picture is not the person or the artwork itself.
Privacy / blurring
We trained our own model to detect faces and license plates. Like any computer vision model (yes, AI), it is not 100% accurate and can produce false negatives. The feedback we have had so far is very good, but do not expect zero false negatives (or false positives).
We implemented a reporting link in the viewer which automatically hides the reported picture, so anybody can now report and hide a picture within a few clicks.
If your country has stricter privacy rules, training a dedicated model is possible if other kinds of things need to be blurred.
Forbidden areas
We also recently implemented “forbidden areas”, which let an instance define areas where pictures are discarded at upload time. In France we have an official list of such areas (some military zones, prisons, nuclear plants, etc).
The same mechanism can be used to accept pictures only in a given area. For example, the IGN instance will limit its coverage to French territory.
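To make the idea concrete, here is a minimal sketch of such an upload-time check in Python. It is not the actual Panoramax implementation: the function names, the `(longitude, latitude)` tuples and the simple polygon rings are assumptions for illustration, and a real instance would more likely rely on PostGIS or a geometry library.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (longitude, latitude), hypothetical convention


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if the point lies inside the polygon ring."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count
        if (y1 > y) != (y2 > y):
            # Longitude at which this edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def accept_upload(
    position: Point,
    forbidden_areas: List[List[Point]],
    allowed_area: Optional[List[Point]] = None,
) -> bool:
    """Reject a picture located in a forbidden area or outside the allowed one."""
    if allowed_area is not None and not point_in_polygon(position, allowed_area):
        return False
    return not any(point_in_polygon(position, area) for area in forbidden_areas)
```

With a square area covering longitudes 2 to 3 and latitudes 48 to 49, a picture taken at `(2.5, 48.5)` would be rejected when the square is a forbidden area, and accepted when the same square is passed as `allowed_area`.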
License on pictures and derivatives
The license under which the pictures can be reused for mapping must be carefully chosen.
The public sector in France uses a license similar to CC-BY (called Licence Ouverte), so this is fine for mapping by anyone, including OSM.
The OSM-FR community prefers share-alike licenses, so we opted for a “light” CC-BY-SA, which needs some explanation.
CC-BY-SA requires any derivative to be under CC-BY-SA too, which is not compatible with OSM’s ODbL license. So we voted to grant an additional right to reuse the pictures to produce non-photographic derivatives under Licence Ouverte (similar to CC-BY) or ODbL.
Remember that Mapillary also publishes pictures under CC-BY-SA and grants an additional right to use them to contribute to OpenStreetMap. We extended that principle to also allow the public sector to generate data that is even “more” open than OSM’s ODbL.
One side effect of the above is that one cannot take pictures from Mapillary and put them in Panoramax without the picture author’s approval. That is why people who offer to take pictures out of Mapillary and republish them on Panoramax ask for this approval.
We recommend that local OSM communities adopt the same “light” CC-BY-SA or a CC-BY.
The Panoramax meta-catalog currently only contains pictures available under these licenses, and we plan to keep it that way so that mapping reusers are safe and do not have to check each picture’s license.
Investment
Financial
Panoramax is mostly a storage challenge.
That’s why we opted for decentralization and instances, but it remains a challenge for each instance.
For example, a rough estimate for a single dense coverage of France gives something like 100M pictures. With redundancy and a backup we easily reach 1PB of storage.
In most cases, cloud-based hosting is very expensive, especially for storage, and is not an option to consider in the long term (unless it is offered… in the long term too!).
OSM-FR chose to self-host its own server and storage bay in a sponsored colocation half-rack we have near Paris. All hardware except the SSDs is donated or second-hand (eBay), including the HDDs, which lowered the cost.
The server is a Dell R7910, with 2 GTX 1070 (for the blurring API), 256GB of RAM, 2 4TB NVMe SSDs, and mostly 8TB HDDs (around 40000h old). Total cost: 3500€.
We compensate for the risk of old disks failing by using ZFS software RAID redundancy.
The last disks we bought were less than 10€/TB.
To give an idea of the storage need, we currently have 20M pictures with around 100TB of storage used.
Distributing the storage is something to consider. One way to do it, which still needs to be investigated and tested, is Garage, a distributed S3 storage. The drawback is that more storage is needed to guarantee redundancy.
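As a rough illustration of that overhead, the sketch below computes the raw disk space needed when every object is stored as several full copies, and an upper bound on the disk budget using the 10€/TB figure mentioned above. The replication factor of 3 used in the example is an assumption, not a measured setting of any instance; check the Garage documentation for the value that applies to your layout.

```python
def raw_capacity_tb(usable_tb: float, replication_factor: int) -> float:
    """Raw disk space needed when each object is stored replication_factor times."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return usable_tb * replication_factor


def disk_budget_eur(raw_tb: float, eur_per_tb: float = 10.0) -> float:
    """Disk cost for a given raw capacity (10 EUR/TB is the second-hand price above)."""
    return raw_tb * eur_per_tb


# Example: 100 TB of pictures with an assumed replication factor of 3
raw = raw_capacity_tb(100.0, replication_factor=3)
budget = disk_budget_eur(raw)
```

Here 100 TB of usable space would require 300 TB of raw disks, or at most about 3000€ of second-hand drives at that price.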
GPUs are not mandatory to start (see below).
Human
Setting up a server and administering it takes some time and means someone has to do it.
Upgrading the software stack takes some time especially when the code base is moving fast.
Plan to dedicate a few hours a week to maintaining an instance (see below).
We can help with the setup, do not hesitate to ask!
Administering a Panoramax instance means administering at least:
- the storage space
- the Panoramax backend API (Python based)
- the underlying PostgreSQL database
- the Panoramax frontend with the viewer (JavaScript)
The blurring API is an additional part you can deploy yourself, or you can start by using the one set up by OSM-FR (we have 2 servers with a total of 4 GPUs, and they are far from being overloaded).
A few numbers from the OSM-FR instance as of mid-December 2024:
- 22.7M pictures
- 48.5 TB of storage for the originals (2.2 TB per 1M pictures)
- 22.5 TB for the derivatives (1 TB per 1M pictures)
- 157 GB for the PostgreSQL DB (7 GB per 1M pictures, 114 GB once compressed by ZFS)
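These per-million figures can be turned into a quick sizing estimate for a new instance. The sketch below is only an approximation: it assumes your pictures have a similar average size to ours, and the `copies` parameter (how many full copies you keep for redundancy and backup) is a hypothetical knob, not a Panoramax setting.

```python
# Per-million-picture figures from the OSM-FR instance (mid-December 2024)
ORIGINALS_TB_PER_M = 2.2
DERIVATIVES_TB_PER_M = 1.0
DB_GB_PER_M = 7.0  # uncompressed PostgreSQL size


def estimate_storage_tb(pictures_millions: float, copies: int = 1) -> float:
    """Estimate total storage in TB for a number of pictures (in millions).

    `copies` is the assumed number of full copies kept (redundancy + backup).
    """
    pictures_tb = pictures_millions * (ORIGINALS_TB_PER_M + DERIVATIVES_TB_PER_M)
    db_tb = pictures_millions * DB_GB_PER_M / 1000  # GB -> TB
    return (pictures_tb + db_tb) * copies


# Example: 100M pictures kept in 3 copies (assumption)
total_tb = estimate_storage_tb(100, copies=3)
```

For 100M pictures kept in three copies this gives roughly 960 TB, which is in line with the 1PB order of magnitude mentioned in the financial section.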