BDOT10k buildings import
BDOT10k buildings import is an import of building data covering Poland from the BDOT10k dataset. The import is in progress as of 2023-07.
Please read here for up-to-date information about this import.
The data is uploaded from the NorthCrab_upload account on OSM.
Goals
The goal of this import is to enhance the OpenStreetMap database with building data from BDOT10k, using AI to verify the correctness of the data, and historical OSM data to identify previously deleted buildings.
Schedule
The project is ongoing, with updates being made as the AI tool is refined and new data is available.
Import Data
Background
Data source site: https://bdot10k.geoportal.gov.pl/ and https://www.geoportal.gov.pl/dane/ortofotomapa
Data license: https://www.geoportal.gov.pl/regulamin
Type of license: Pl:Geoportal.gov.pl
OSM attribution: source=www.geoportal.gov.pl
ODbL Compliance verified: yes
OSM Data Files
The script preparing data for import can be found in the GitHub repository.
Import Type
This is a recurring import that is performed using an automated script.
The data is entered into the OSM database using the OSM API.
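As an illustration of what an upload through the OSM API involves, the sketch below builds an osmChange document (the format accepted by API v0.6 changeset uploads) for one new building. It is not taken from the import script: the function name `building_to_osmchange`, the `generator` value, and the sample coordinates are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def building_to_osmchange(ring, tags, changeset_id):
    """Return an osmChange XML string that creates one closed way.

    ring: list of (lat, lon) vertices, without repeating the first vertex.
    tags: dict of OSM tags for the way.
    Hypothetical helper for illustration only.
    """
    root = ET.Element("osmChange", version="0.6", generator="example-sketch")
    create = ET.SubElement(root, "create")

    # New elements get negative placeholder ids; the server assigns real ones.
    node_ids = []
    for i, (lat, lon) in enumerate(ring, start=1):
        node_id = -i
        ET.SubElement(
            create, "node",
            id=str(node_id), changeset=str(changeset_id),
            lat=f"{lat:.7f}", lon=f"{lon:.7f}",
        )
        node_ids.append(node_id)

    way = ET.SubElement(create, "way", id="-1", changeset=str(changeset_id))
    # Repeat the first node at the end so the way is closed.
    for node_id in node_ids + node_ids[:1]:
        ET.SubElement(way, "nd", ref=str(node_id))
    for key, value in tags.items():
        ET.SubElement(way, "tag", k=key, v=value)

    return ET.tostring(root, encoding="unicode")


example_ring = [(52.0, 21.0), (52.0, 21.0001), (52.0001, 21.0001), (52.0001, 21.0)]
example_xml = building_to_osmchange(
    example_ring, {"building": "yes", "source:building": "BDOT"}, changeset_id=1
)
```

The resulting string would be sent as the body of a changeset upload request; authentication and changeset creation are left out of this sketch.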
Data Preparation
Data Reduction & Simplification
The AI tool fetches building data from BDOT10k, retrieves orthophoto imagery for each building, and preprocesses it.
It then uses a fine-tuned MobileNetV3Large model to classify the correctness of the BDOT10k information, reducing the amount of incorrect data that is imported.
The Polish community agreed that an error rate of 0.3% is reasonable.
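To make the 0.3% figure concrete, here is a minimal sketch of how a manually reviewed sample could be checked against it. It assumes that "error rate" means the share of accepted buildings that a reviewer judged incorrect; the page does not define the term, and the function names are made up for this example.

```python
AGREED_ERROR_RATE = 0.003  # 0.3%, the figure agreed on by the Polish community


def error_rate(reviewed):
    """reviewed: list of booleans, True when an imported building was judged correct."""
    if not reviewed:
        raise ValueError("at least one reviewed building is required")
    wrong = sum(1 for is_ok in reviewed if not is_ok)
    return wrong / len(reviewed)


def within_agreed_rate(reviewed, limit=AGREED_ERROR_RATE):
    """Return True when the observed error rate does not exceed the limit."""
    return error_rate(reviewed) <= limit


# Assumed sample: 1000 reviewed buildings, 2 of them wrong.
sample = [True] * 998 + [False] * 2
sample_ok = within_agreed_rate(sample)
```

With 2 wrong buildings out of 1000 the observed rate is 0.2%, which is within the agreed limit; 4 out of 1000 would not be.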
Tagging Plans
Tags are those provided by the https://budynki.openstreetmap.org.pl/ website.
Key | Value |
---|---|
building or man_made | * |
building:levels | * |
source:building | BDOT |
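The table above can be read as a small rule for assembling tags on each imported object. The following sketch shows one way to express it; `build_tags` is a hypothetical helper, and treating `building:levels` as optional is an assumption, not something stated on this page.

```python
def build_tags(kind_key, kind_value, levels=None):
    """Assemble OSM tags for one imported object.

    kind_key must be "building" or "man_made", matching the tagging table.
    levels is optional here (an assumption for this sketch).
    """
    if kind_key not in ("building", "man_made"):
        raise ValueError("kind_key must be 'building' or 'man_made'")
    tags = {kind_key: kind_value, "source:building": "BDOT"}
    if levels is not None:
        tags["building:levels"] = str(levels)
    return tags


house_tags = build_tags("building", "house", levels=2)
```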
Data Transformation
The data is transformed using a Python script, which can be found in the GitHub repository.
Data Transformation Results
No static transformation results are published, because the data is transformed on demand.
Data Merge Workflow
Team Approach
This import is being performed by a single individual, with community input and discussion.
References
All references are included in the related text as hyperlinks.
Workflow
- Fetch building data from BDOT10k.
- Retrieve and preprocess orthophoto imagery for each building.
- Use the AI model to classify the correctness of the BDOT10k information.
- Verify historical OSM data to identify previously deleted buildings.
- Import new buildings into OSM.
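The filtering part of these steps could be sketched as follows. This is not the import script: the function and field names are hypothetical, the 0.5 score threshold is arbitrary, and the sketch assumes that buildings previously deleted from OSM are left out of the import, which this page implies but does not state.

```python
def select_buildings_to_import(candidates, is_correct, was_deleted):
    """Keep candidates accepted by the classifier and never deleted from OSM.

    is_correct and was_deleted are callables taking one candidate each;
    both are stand-ins for the real model and history lookup.
    """
    selected = []
    for building in candidates:
        if not is_correct(building):
            continue  # the classifier rejected the BDOT10k record
        if was_deleted(building):
            continue  # assumed: a mapper removed it before, so do not re-add
        selected.append(building)
    return selected


# Assumed example data: "score" stands in for the model output.
candidates = [
    {"id": 1, "score": 0.99},
    {"id": 2, "score": 0.20},
    {"id": 3, "score": 0.97},
]
previously_deleted_ids = {3}

to_import = select_buildings_to_import(
    candidates,
    is_correct=lambda b: b["score"] >= 0.5,
    was_deleted=lambda b: b["id"] in previously_deleted_ids,
)
```

Passing the two checks in as callables keeps the sketch independent of how the model and the history lookup are actually implemented.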
Changesets will be kept to a manageable size, and any issues that arise will be addressed promptly.
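One simple way to keep changesets small is to split the buildings into fixed-size batches and upload one batch per changeset. The sketch below shows the idea; the helper name and the batch size are assumptions, since this page does not give a concrete limit.

```python
def batched(items, max_size):
    """Split items into consecutive lists of at most max_size elements."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    items = list(items)
    return [items[i:i + max_size] for i in range(0, len(items), max_size)]


# Assumed example: 5 buildings, at most 2 per changeset.
changeset_batches = batched(["b1", "b2", "b3", "b4", "b5"], max_size=2)
```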
Conflation
Any buildings that already exist in the OSM database are skipped.
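How "already exists" is decided is not described here. As a rough illustration only, the sketch below treats a candidate as existing when its bounding box overlaps the bounding box of any OSM building; the real script may use a different and more precise geometric test, and all names are hypothetical.

```python
def bboxes_overlap(a, b):
    """a, b: (min_lon, min_lat, max_lon, max_lat). True if the interiors overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def skip_existing(candidate_bboxes, existing_bboxes):
    """Return the candidates that overlap no existing OSM building."""
    return [
        c for c in candidate_bboxes
        if not any(bboxes_overlap(c, e) for e in existing_bboxes)
    ]


# Assumed example coordinates.
existing = [(21.0000, 52.0000, 21.0002, 52.0002)]
new_candidates = [
    (21.0001, 52.0001, 21.0003, 52.0003),  # overlaps the existing building
    (21.0010, 52.0010, 21.0012, 52.0012),  # far enough away
]
remaining = skip_existing(new_candidates, existing)
```

Comparing every candidate with every existing building is quadratic; a spatial index would be needed at country scale.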
QA
Quality assurance is performed in several steps:
- During model creation, manually classified data is split into training and testing datasets. The precision of the model is measured on the test data before each production release.
- Some changesets are manually verified by the community.
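The first step can be illustrated with a small, library-free sketch of a train/test split and a precision calculation (true positives divided by all predicted positives). The 20% test fraction, the fixed seed, and the function names are assumptions for this example, not details of the actual training setup.

```python
import random


def split_dataset(samples, test_fraction=0.2, seed=0):
    """Shuffle samples reproducibly and return (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def precision(y_true, y_pred):
    """Share of predicted-correct buildings that really are correct."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    if true_pos + false_pos == 0:
        raise ValueError("the model predicted no positives")
    return true_pos / (true_pos + false_pos)


train, test = split_dataset(range(10))
# Assumed labels: 1 = building is correct, 0 = building is incorrect.
example_precision = precision([1, 1, 0, 1], [1, 1, 1, 0])
```

Precision is the relevant metric here because a false positive means an incorrect building would be uploaded to OSM.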