Address data for new countries – required formats and structure

Several users from other countries have asked how they can help us build address data for their region. This post describes what kind of data we need, how it should be structured, and in which format it can be provided. It also explains how you can support us by pointing to suitable public data sources that we can review and use.

General structure

We need address data that covers the entire country and reflects the following hierarchical structure:

Country → Region or State → Municipality or Administrative District → City → District or Subdivision → Street

Each street must be clearly linked to its higher-level administrative units. The data should be as complete, consistent, and geographically accurate as possible.

File format

We can process various data formats, but CSV files are the simplest.
Files should:

  • be encoded in UTF-8
  • use either comma (,) or semicolon (;) as a delimiter
  • include a header row with column names

The CSV format is given as an example. We can also convert other formats (for example Excel, GeoJSON, Shapefile) as long as the contained data matches the structure and quality described here.
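If you want to check a file against the points above before sending it, something like the following Python sketch may help. It is not our importer; the function names `read_address_rows` and `load_address_file` are made up for this example, and guessing the delimiter from the header row is only one possible approach.

```python
import csv
import io


def read_address_rows(text):
    """Parse CSV text that has a header row into a list of dicts."""
    lines = text.splitlines()
    header = lines[0] if lines else ""
    # Only comma and semicolon are allowed, so pick whichever one
    # occurs more often in the header row.
    delimiter = ";" if header.count(";") > header.count(",") else ","
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    # Every value stays a string, so a postal code such as "01234"
    # keeps its leading zero.
    return list(reader)


def load_address_file(path):
    # "utf-8-sig" reads plain UTF-8 and also drops a byte order mark
    # if a spreadsheet export added one.
    with open(path, encoding="utf-8-sig", newline="") as handle:
        return read_address_rows(handle.read())
```

Calling `load_address_file("addresses.csv")` (the file name is just a placeholder) returns one dict per data row, keyed by the column names from the header.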

Column structure

The following columns form the basis. Required fields are necessary for processing; optional fields improve accuracy but are not mandatory.

| Column | Description | Example | Required |
|---|---|---|---|
| Country | Name or ISO code of the country | France | Yes |
| State | Name of the region or state | Île-de-France | Yes |
| Commune | Name of the municipality or administrative area | Paris | Yes |
| City | Name of the city | Paris | Yes |
| District | Name of the district or subdivision | 1er Arrondissement | Yes |
| Zip | Postal code | 75001 | Yes |
| Street | Street name | Rue de Rivoli | Yes |
| StreetNumberMin | Lowest house number | 1 | Optional |
| StreetNumberMax | Highest house number | 99 | Optional |
| DistrictLat | Latitude of the district in decimal degrees | 48.85661 | Yes |
| DistrictLng | Longitude of the district in decimal degrees | 2.35222 | Yes |
| PhonePrefix | Telephone prefix | 01 | Optional |

Example dataset (comma separated):

Country,State,Commune,City,District,Zip,Street,StreetNumberMin,StreetNumberMax,DistrictLat,DistrictLng,PhonePrefix
France,Île-de-France,Paris,Paris,1er Arrondissement,75001,Rue de Rivoli,1,99,48.85661,2.35222,+33 1
Netherlands,Noord-Holland,Amsterdam,Amsterdam,Centrum,1012,Damrak,1,200,52.37312,4.89222,+31 20

Formatting rules

  • Text must be UTF-8 encoded
  • Decimal separator for coordinates must be a period (.), not a comma
  • Postal codes can be alphanumeric; leading zeros must be preserved
  • Each combination of country, district, and street should be unique
  • Empty values are allowed where specific information (such as house numbers) is not available
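As a rough illustration of the formatting rules above, a per-row check could look like this Python sketch. The function name `check_row_format` and the exact messages are assumptions for this example, and the latitude and longitude range checks are an extra sanity test that the rules do not spell out.

```python
def check_row_format(row):
    """Return a list of formatting problems for one row (dict of strings)."""
    problems = []
    for column, limit in (("DistrictLat", 90.0), ("DistrictLng", 180.0)):
        value = (row.get(column) or "").strip()
        if "," in value:
            problems.append(f"{column}: use a period as decimal separator")
            continue
        try:
            number = float(value)
        except ValueError:
            problems.append(f"{column}: missing or not a decimal number")
            continue
        # Assumption: values outside the valid range are a sign of
        # swapped or corrupted coordinates.
        if not -limit <= number <= limit:
            problems.append(f"{column}: outside the range +/-{limit:g}")
    # The postal code is only checked for presence and is never converted
    # to a number, so alphanumeric codes and leading zeros stay intact.
    if not (row.get("Zip") or "").strip():
        problems.append("Zip: missing")
    return problems
```

An empty list means the row passed these particular checks. House number columns are left alone on purpose, because empty values are allowed there.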

Data limitations

  • A street can only exist once within the same district (street name must be unique per district).
  • A district name and ZIP code combination must be unique within the same city.
  • Each address record must belong to an existing district, and each district must belong to a valid city, commune, state, and country (foreign key hierarchy must be valid).
  • Latitude and longitude are required for every district and serve as the fallback coordinates when no street-level data is available.
  • Streets that cross district or postal boundaries must be created as separate entries for each district or ZIP area.
  • Empty or duplicate key values are not allowed in the hierarchy (e.g. missing district references or repeated street names in the same district).
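The limitations above can be pre-checked before submitting a file. The Python sketch below is only an illustration: the name `find_hierarchy_problems` is made up, and it assumes that a district is identified by its full path (Country, State, Commune, City, District) together with the Zip column, so that the same street in two ZIP areas counts as two valid entries.

```python
from collections import Counter

# Assumption: these columns together identify one district entry.
DISTRICT_KEY = ("Country", "State", "Commune", "City", "District", "Zip")


def find_hierarchy_problems(rows):
    """Report empty hierarchy values and streets repeated in a district."""
    problems = []
    seen = Counter()
    for index, row in enumerate(rows, start=1):
        key_columns = DISTRICT_KEY + ("Street",)
        missing = [c for c in key_columns if not (row.get(c) or "").strip()]
        if missing:
            problems.append(f"row {index}: empty value in {', '.join(missing)}")
            continue
        seen[tuple(row[c].strip() for c in key_columns)] += 1
    for key, count in seen.items():
        if count > 1:
            street, district = key[-1], key[-3]
            problems.append(f"street '{street}' appears {count} times in '{district}'")
    return problems
```

The rows are expected as dicts keyed by the column names from the header row. An empty result only means that these two checks passed, not that the data is complete.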

Coordinates and fallback behavior

Normally, address coordinates are resolved dynamically down to street or even house number level.

If this resolution is not possible for a specific area, the system falls back to the coordinates of the corresponding district.

For this reason, the latitude and longitude values for each district must be accurate and represent the geographic center of that district.

If these coordinates are imprecise, locations may be displayed incorrectly. The district coordinates are the lowest fallback level used when no more detailed coordinates are available.
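To make the fallback order described above concrete, here is a small Python sketch. It is not the actual implementation; the function name `resolve_coordinates` and the way missing levels are passed as `None` are assumptions for illustration.

```python
def resolve_coordinates(house_point, street_point, district_point):
    """Return (level, (lat, lng)) for the most detailed level available.

    Each argument is a (lat, lng) tuple, or None if that level could
    not be resolved.
    """
    candidates = (
        ("house", house_point),
        ("street", street_point),
        ("district", district_point),
    )
    for level, point in candidates:
        if point is not None:
            return level, point
    # The district is the lowest fallback level, so it must always exist.
    raise ValueError("district coordinates are required")
```

For example, `resolve_coordinates(None, None, (48.85661, 2.35222))` falls through to the district level, which is why an imprecise district center shows up directly as a misplaced location.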

Support through data sources

In addition to providing structured datasets directly, it is especially helpful if you can point us to official or open data sources that contain address information for entire countries or regions.

Suitable examples include:

  • national or regional address registers
  • municipal or governmental geodata portals
  • open administrative databases (open data)
  • freely available geodata such as OpenStreetMap

We will review these sources and prepare the data for integration into our system.

Goal

With this information, we can build complete address data packages for additional countries. Any contribution, whether in the form of structured data or references to public sources, helps us expand coverage more quickly.

If you have datasets or know of suitable sources that follow or approximate this structure, please share them here in the forum. We will review each submission and determine how it can be integrated.

I sent the files to the email address [email protected].

Is that okay? Will they end up in the right place? :slight_smile:

Thank you very much. We have added the data. The data for Luxembourg should now be available when the address package is selected.

Please test the data thoroughly. If the import is wrong, we need to correct it. If we remove and re-insert the records later, all POIs would lose their addresses again.

Hello, I have a database that includes coordinates for every house number. How can I use this? Should I choose one random number?