Several users from other countries have asked how they can help us build address data for their region. This post describes what kind of data we need, how it should be structured, and in which format it can be provided. It also explains how you can support us by pointing to suitable public data sources that we can review and use.
General structure
We need address data that covers the entire country and reflects the following hierarchical structure:
Country → Region or State → Municipality or Administrative District → City → District or Subdivision → Street
Each street must be clearly linked to its higher-level administrative units. The data should be as complete, consistent, and geographically accurate as possible.
File format
We can process various data formats, but CSV files are the simplest.
Files should:
- be encoded in UTF-8
- use either comma (,) or semicolon (;) as a delimiter
- include a header row with column names
The CSV format is given as an example. We can also convert other formats (for example Excel, GeoJSON, Shapefile) as long as the contained data matches the structure and quality described here.
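If you want to check that a file can be read under these rules before sending it, a small Python sketch like the following may help. The function name `read_address_csv` and the delimiter heuristic (whichever of `,` or `;` appears more often in the header row) are assumptions made for this post, not a fixed specification.

```python
import csv
import io


def read_address_csv(raw: bytes) -> list[dict[str, str]]:
    """Decode UTF-8 bytes and parse them as comma- or semicolon-delimited CSV."""
    # "utf-8-sig" also accepts files that start with a byte order mark (BOM).
    text = raw.decode("utf-8-sig")
    if not text.strip():
        return []
    header = text.splitlines()[0]
    # Assumed heuristic: pick the candidate that occurs more often in the header row.
    delimiter = ";" if header.count(";") > header.count(",") else ","
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter=delimiter)
    return list(reader)


if __name__ == "__main__":
    sample = "Country;Zip\nFrance;01000\n".encode("utf-8")
    print(read_address_csv(sample))
```

A file that is not valid UTF-8 raises `UnicodeDecodeError` here, which is usually the quickest way to spot an encoding problem.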
Column structure
The following columns form the basis. Required fields are necessary for processing; optional fields improve accuracy but are not mandatory.
| Column | Description | Example | Required |
|---|---|---|---|
| Country | Name or ISO code of the country | France | Yes |
| State | Name of the region or state | Île-de-France | Yes |
| Commune | Name of the municipality or administrative area | Paris | Yes |
| City | Name of the city | Paris | Yes |
| District | Name of the district or subdivision | 1er Arrondissement | Yes |
| Zip | Postal code | 75001 | Yes |
| Street | Street name | Rue de Rivoli | Yes |
| StreetNumberMin | Lowest house number | 1 | Optional |
| StreetNumberMax | Highest house number | 99 | Optional |
| DistrictLat | Latitude of the district in decimal degrees | 48.85661 | Yes |
| DistrictLng | Longitude of the district in decimal degrees | 2.35222 | Yes |
| PhonePrefix | Telephone prefix | +33 1 | Optional |
Example dataset (comma separated):
Country,State,Commune,City,District,Zip,Street,StreetNumberMin,StreetNumberMax,DistrictLat,DistrictLng,PhonePrefix
France,Île-de-France,Paris,Paris,1er Arrondissement,75001,Rue de Rivoli,1,99,48.85661,2.35222,+33 1
Netherlands,Noord-Holland,Amsterdam,Amsterdam,Centrum,1012,Damrak,1,200,52.37312,4.89222,+31 20
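To illustrate how the required columns from the table map onto such a file, here is a minimal Python sketch that parses the first example row and reports required fields that are empty. The helper name `missing_required` is hypothetical and only used for this example.

```python
import csv
import io

# Columns marked "Yes" in the table above.
REQUIRED = (
    "Country", "State", "Commune", "City", "District",
    "Zip", "Street", "DistrictLat", "DistrictLng",
)

SAMPLE = """Country,State,Commune,City,District,Zip,Street,StreetNumberMin,StreetNumberMax,DistrictLat,DistrictLng,PhonePrefix
France,Île-de-France,Paris,Paris,1er Arrondissement,75001,Rue de Rivoli,1,99,48.85661,2.35222,+33 1
"""


def missing_required(row: dict) -> list[str]:
    """Return the names of required columns that are absent or blank in this row."""
    return [col for col in REQUIRED if not (row.get(col) or "").strip()]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for row in rows:
    print(row["Street"], missing_required(row))
```

Optional columns such as StreetNumberMin may stay empty without being reported.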
Formatting rules
- Text must be UTF-8 encoded
- Decimal separator for coordinates must be a period (.), not a comma
- Postal codes can be alphanumeric; leading zeros must be preserved
- Each combination of country, district, and street should be unique
- Empty values are allowed where specific information (such as house numbers) is not available
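Two of these rules are easy to get wrong when a spreadsheet program is involved: decimal commas in coordinates and postal codes that lose their leading zeros. The following Python sketch shows one way to check both. The function names and the range limits passed in (90 for latitude, 180 for longitude) are choices made for this example.

```python
def parse_coordinate(value: str, limit: float) -> float:
    """Parse decimal degrees written with a period; reject commas and out-of-range values."""
    value = value.strip()
    if "," in value:
        raise ValueError(f"use a period as decimal separator: {value!r}")
    number = float(value)
    # NaN and infinity fail this comparison and are rejected as well.
    if not -limit <= number <= limit:
        raise ValueError(f"coordinate out of range (+/-{limit}): {value!r}")
    return number


def normalize_zip(value: str) -> str:
    """Keep the postal code as text so leading zeros and letters survive."""
    return value.strip()


if __name__ == "__main__":
    lat = parse_coordinate("48.85661", 90)
    lng = parse_coordinate("2.35222", 180)
    print(lat, lng, normalize_zip("01000"))
```

Converting a postal code to a number (for example `int("01000")`) would turn it into 1000, which is why the sketch keeps it as a string.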
Data limitations
- A street can only exist once within the same district (street name must be unique per district)
- A district name and ZIP code combination must be unique within the same city.
- Each address record must belong to an existing district, and each district must belong to a valid city, commune, state, and country (foreign key hierarchy must be valid).
- Latitude and longitude are required for every district and serve as the fallback coordinates when no street-level data is available.
- Streets that cross district or postal boundaries must be created as separate entries for each district or ZIP area.
- Empty or duplicate key values are not allowed in the hierarchy (e.g. missing district references or repeated street names in the same district).
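The uniqueness and completeness rules above can be checked roughly as follows. This Python sketch assumes that a district is identified by the combination of Country, State, Commune, City, District, and Zip, which is one reading of the rules above rather than a fixed definition; the function name `find_problems` is likewise made up for this example.

```python
from collections import Counter

# Assumed identity of a district: the full hierarchy plus the postal code.
DISTRICT_KEY = ("Country", "State", "Commune", "City", "District", "Zip")


def find_problems(rows: list[dict]) -> list[tuple[int, str]]:
    """Return (line number, message) pairs for empty keys and duplicate streets."""
    problems = []
    seen = Counter()
    # Data rows start on line 2 because line 1 is the header row.
    for line_no, row in enumerate(rows, start=2):
        key = tuple((row.get(col) or "").strip() for col in DISTRICT_KEY)
        street = (row.get("Street") or "").strip()
        if "" in key or not street:
            problems.append((line_no, "empty key value"))
            continue
        seen[key + (street,)] += 1
        if seen[key + (street,)] > 1:
            problems.append((line_no, "duplicate street in district"))
    return problems
```

A street that crosses a postal boundary does not count as a duplicate here, because its two entries differ in the Zip column.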
Coordinates and fallback behavior
Normally, address coordinates are resolved dynamically down to street or even house number level.
If this resolution is not possible for a specific area, the system falls back to the coordinates of the corresponding district.
For this reason, the latitude and longitude values for each district must be accurate and represent the geographic center of that district.
If these coordinates are imprecise, locations may be displayed incorrectly. The district coordinates are the last fallback level, used when no more detailed coordinates are available.
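The fallback order can be pictured roughly like this. It is only an illustration of the behavior described above: the function name, the parameter names, and the (latitude, longitude) tuple representation are invented for this post.

```python
from typing import Optional

Coordinate = tuple[float, float]  # (latitude, longitude) in decimal degrees


def resolve_coordinates(
    district_center: Coordinate,
    street: Optional[Coordinate] = None,
    house: Optional[Coordinate] = None,
) -> Coordinate:
    """Return the most detailed coordinate available, ending at the district center."""
    for candidate in (house, street):
        if candidate is not None:
            return candidate
    # Nothing more detailed could be resolved, so the district center is used.
    return district_center
```

Because the district center is the only value that is always present, an imprecise entry in DistrictLat or DistrictLng affects every address in that district that cannot be resolved more precisely.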
Support through data sources
In addition to providing structured datasets directly, it is especially helpful if you can point us to official or open data sources that contain address information for entire countries or regions.
Suitable examples include:
- national or regional address registers
- municipal or governmental geodata portals
- open administrative databases (open data)
- freely available geodata such as OpenStreetMap
We will review these sources and prepare the data for integration into our system.
Goal
With this information, we can build complete address data packages for additional countries. Any contribution, whether in the form of structured data or references to public sources, helps us expand coverage more quickly.
If you have datasets or know of suitable sources that follow or approximate this structure, please share them here in the forum. We will review each submission and determine how it can be integrated.