Deploying a Panoramax instance: the pre-flight checklist

More and more local OSM communities are considering deploying a Panoramax instance.

I’m sharing here the important points to consider before moving ahead: the legal and technical aspects, and the investment required (human and financial).

Legal

Each country can have its specific law regarding privacy, data protection, photography in public spaces, etc.

Before starting the project in France, we carried out a legal study to make sure we were not putting ourselves at risk and to know what can and can’t be done.

For example:

  • taking pictures in the public space is ok,
  • making them public can only be done after blurring faces and license plates,
  • people, or the intellectual property rights of owners of buildings or artworks present in the public space, are not an issue as long as the subject of the picture is not the person or the artwork itself.

Privacy / blurring

We trained our own model to detect faces and license plates. Like any computer vision model (yes, AI), it is not 100% accurate. The feedback we have had so far is very good, but expect some false negatives (and false positives too).

We implemented a reporting link in the viewer which automatically hides the reported picture, so anybody can report and hide a picture within a few clicks.

If your country has stricter privacy rules, or if other kinds of things need to be blurred, training a dedicated model is possible.

Forbidden areas

We also recently implemented “forbidden areas”, which let an instance define areas where pictures will be discarded at upload time. In France we have an official list of such areas (some military zones, prisons, nuclear plants, etc.).

The same mechanism can be used to accept pictures only in a given area; for example, the IGN instance will limit its coverage to French territory.
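To make the idea concrete, here is a minimal sketch of such a check at upload time. This is not how Panoramax implements it (a real instance would more likely rely on a spatial query in its database); the function names, the polygon representation and the coordinates are all hypothetical, and the test treats longitude/latitude as flat coordinates, which is only a reasonable approximation for small areas.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (longitude, latitude), hypothetical convention


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if the point lies inside the polygon ring."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # Longitude at which the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def accept_upload(
    position: Point,
    forbidden_areas: List[List[Point]],
    allowed_area: Optional[List[Point]] = None,
) -> bool:
    """Reject a picture outside the allowed area or inside a forbidden one."""
    if allowed_area is not None and not point_in_polygon(position, allowed_area):
        return False
    return not any(point_in_polygon(position, area) for area in forbidden_areas)


# Made-up example areas, for illustration only
forbidden_square = [(2.0, 48.0), (2.1, 48.0), (2.1, 48.1), (2.0, 48.1)]
coverage_box = [(-5.0, 41.0), (10.0, 41.0), (10.0, 52.0), (-5.0, 52.0)]

print(accept_upload((2.05, 48.05), [forbidden_square], coverage_box))  # inside the forbidden square
print(accept_upload((2.5, 48.5), [forbidden_square], coverage_box))    # inside coverage, not forbidden
print(accept_upload((20.0, 45.0), [forbidden_square], coverage_box))   # outside the coverage box
```

The same function covers both uses described above: a list of forbidden polygons, and an optional single polygon that restricts the instance to a given territory.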

License on pictures and derivatives

The license under which the pictures can be reused for mapping must be carefully chosen.

The public sector in France uses a license similar to CC-BY (called Licence Ouverte), so this is ok for mapping for anyone, including OSM.

The OSM-FR community prefers share-alike licenses, so we opted for a “light” CC-BY-SA, which needs some explanation.
CC-BY-SA requires any derivative to be under CC-BY-SA too, and this is not compatible with OSM’s ODbL license. So we voted to grant an additional right to reuse the pictures to produce non-photographic derivatives under Licence Ouverte (similar to CC-BY) or ODbL.

Remember that Mapillary also publishes pictures under CC-BY-SA and grants an additional right to use them to contribute to OpenStreetMap. We extended that principle so that the public sector can also generate data that is even “more” open than OSM’s ODbL.

One side effect of the above is that one cannot take pictures from Mapillary and put them in Panoramax without the picture author’s approval. That’s why people who offered to export pictures and republish them on Panoramax ask for this approval.

We recommend that local OSM communities adopt the same “light” CC-BY-SA or a CC-BY.

The Panoramax meta-catalog currently only contains pictures available under these licenses, and we plan to keep it that way so that mapping reusers are safe and do not have to check each picture’s license.

Investment

Financial

Panoramax is mostly a storage challenge.

That’s why we opted for decentralization and instances, but it remains a challenge for each instance.

For example, a rough estimate for a single dense coverage of France gives something like 100M pictures. With redundancy and a backup we easily reach 1 PB of storage.

In most cases, cloud-based hosting is very expensive, especially for storage, and is not an option to consider in the long term (unless it is offered… in the long term too!).

OSM-FR’s choice has been to self-host our own server and storage bay in a sponsored colocation half-rack we have near Paris. All hardware except the SSDs is donated or second-hand (eBay), including the HDDs. That allowed us to lower the cost.
The server is a Dell R7910, with 2 GTX 1070 (for the blurring API), 256GB of RAM, 2 4TB NVMe SSDs, and mostly 8TB HDDs (around 40000h old). Total cost: 3500€.
We compensate for the risk of old disks failing by using ZFS software RAID redundancy.
The last disks we bought cost less than 10€/TB.

To give an idea of the storage need, we currently have 20M pictures with around 100TB of storage used.

Distributing the storage is something to consider. One way to do it, still to be investigated and tested, is Garage, which is a distributed S3-compatible storage. The drawback is that more storage is needed to guarantee redundancy.

GPUs are not mandatory to start (see below).
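To give a feel for that drawback, here is a small back-of-the-envelope sketch. It assumes a distributed store that keeps a fixed number of full copies of every object; the number of copies is an assumption on my side (check what your Garage setup actually uses), and the function name is made up for the example. The 100 TB figure is the usage mentioned above.

```python
def raw_storage_tb(usable_tb: float, copies: int) -> float:
    """Raw disk space needed when every object is stored `copies` times."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return usable_tb * copies


USABLE_TB = 100  # roughly what the OSM-FR instance uses today

for copies in (1, 2, 3):  # hypothetical replication settings
    print(f"{copies} copies -> {raw_storage_tb(USABLE_TB, copies):.0f} TB of raw disk")
```

With three copies, 100 TB of pictures needs about 300 TB of raw disk spread over the nodes, which is noticeably more than a single parity-based array needs for the same data.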

Human

Setting up a server and administering it takes some time and means someone has to do it.
Upgrading the software stack takes some time especially when the code base is moving fast.

You should plan to dedicate a few hours a week to maintaining an instance (see below).

We can help with the setup, do not hesitate to ask!

Administering a Panoramax instance means administering at least:

  • the storage space
  • the Panoramax backend API (python based)
  • the underlying postgresql database
  • the Panoramax frontend with the viewer (js)

The blurring API is an additional component that you can deploy yourself, or you can start by using the one set up by OSM-FR (we have 2 servers with a total of 4 GPUs, and they are far from being overloaded).


A few numbers from the OSM-FR instance as of mid-December 2024:

  • 22.7M pictures
  • 48.5 TB storage for originals (2.2 TB / 1M pictures)
  • 22.5 TB for the derivatives (1 TB / 1M pictures)
  • 157 GB for the postgresql DB (7 GB / 1M pictures, 114 GB once compressed by ZFS)
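If you want to size your own instance from these numbers, the arithmetic can be sketched like this. The totals are the ones listed above; the function names are made up for the example, and scaling linearly with the number of pictures is an assumption (picture resolution and the share of 360° pictures will change the ratios).

```python
PICTURES_M = 22.7  # millions of pictures on the OSM-FR instance

USAGE_TB = {
    "originals": 48.5,
    "derivatives": 22.5,
    "database": 0.157,  # 157 GB, before ZFS compression
}


def tb_per_million(usage_tb: dict, pictures_m: float) -> dict:
    """Storage used per million pictures, for each kind of data."""
    return {name: tb / pictures_m for name, tb in usage_tb.items()}


def projected_tb(target_pictures_m: float) -> float:
    """Total storage for a target picture count, assuming linear scaling."""
    ratios = tb_per_million(USAGE_TB, PICTURES_M)
    return sum(ratios.values()) * target_pictures_m


for name, ratio in tb_per_million(USAGE_TB, PICTURES_M).items():
    print(f"{name}: {ratio:.2f} TB per million pictures")

print(f"100M pictures: about {projected_tb(100):.0f} TB before redundancy and backup")
```

For 100M pictures this gives a bit more than 300 TB of actual data; once redundancy and a backup are added on top, you get to the order of magnitude of the 1 PB mentioned in the Financial section.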

Thank you for this info.
The OpenStreetMap Croatia community is considering running its own instance, but we were not aware of the hardware requirements, apart from the storage capacity 🙂
What is the general CPU/RAM usage on the instance? Just to know how to set up the server. I have some spare Tesla GPUs, so we can use them for blurring.
Every recommendation is welcome.


CPU is used by:

  • postgresql
  • the API backend
  • the workers that create derivatives from the original (blurred) pictures (thumbnails, low-resolution versions, tiled versions of 360° pictures)

RAM is mostly used by postgresql and the workers… and disk cache.

If you want to run the blurring API, CPU and RAM are also used at that level.

You can see the Munin graphs of the OSM-FR Panoramax server here: osm37.openstreetmap.fr

On this server we have:

  • OSM-FR instance (API backend + the PG database + frontend)
  • a blurring API
  • the meta-catalog (another postgresql)
  • additional tools: Matomo, Weblate

As you can see, the CPU graph shows that we use 3-4 cores on average for all these services.
As we have 256GB of RAM, we use it a lot as disk cache for ZFS (half of it).

I think 64GB and 8 cores should be enough for the backend itself (PG included).


On the IGN instance, here are a few numbers about the allocated RAM:

  • python backend container has 4GB of RAM
  • the postgresql DB has 8GB
  • the workers are running on a VM which is way too large (64GB and 16 cores)

We’re planning to move this to a more self-managed hosting to have a better and more direct control.
