User:Arjunaraoc/Updating wikidata for places in Andhra Pradesh

From OpenStreetMap Wiki

Data sources of places

The initial creation of places appears to have started with the import of highway information from AND, under the user name AND. Later, data on about 41K places was gathered from Bhuvan during 2018 and subsequently uploaded to OSM. As on 2024-04-01, there were 32676 places on OSM, considering all sources. At the time of the data acquisition, this data was protected. Subsequently, SOI released 1:50000 scale open series maps under the National Geospatial Policy in 2021-2022. The SOI Geoportal made these open series maps available, and OSM users used them for village boundary updates.

As per the Local Government Directory (which is based on Census 2011 data), there are 17950 villages (including uninhabited, forest etc.). Wikidata and Telugu Wikipedia have information about 15622 inhabited revenue villages. As per egramswaraj.gov.in, Andhra Pradesh has 13326 gram panchayats, with information on 13318. There were 32676 places in OSM covering Andhra Pradesh as on 2024-04-01.

Based on the Karmashapes curation, which is CC-BY licensed, the number of habitations in Andhra Pradesh is 43563. The data is available at https://indianopenmaps.fly.dev/habitations/karmashapes-points/%7Bz%7D/%7Bx%7D/%7By%7D.pbf and is hosted for viewing at https://indianopenmaps.fly.dev/habitations/karmashapes-points/view#4/22.5/76.5 . (Other related information is on GitHub at https://github.com/ramSeraph/indian_admin_boundaries/releases/tag/habitations .)

Earlier discussions about data copyright issues: the thread at https://community.openstreetmap.org/t/is-the-license-of-alltheplaces-suitable/99999/46 clarifies that the responsibility lies with the uploader of the data, and that local groups are best placed to assess whether an upload complies with policies and is ethical. A message on the OSMIndia Telegram group at https://t.me/OSMIndia/28718 states that India's National Geospatial Policy provides the released SOI data as a common good. There were no responses to this message on Telegram.

Creation and last modification analysis as on 2024-04-02

Data was analysed on 2024-04-02, with data collected on the same date using the script https://overpass-turbo.eu/s/1JoU . There were 32671 places in total, with 77.7% left untouched after creation. adivik2000 accounts for 78.89% of the places created without further modification and for 33.26% of the places with a later modification (version > 1), counted by last modifier. 105 accounts created places which were not modified later. 215 accounts made at least one modification to an already created place. 9 accounts account for up to 80% of modifications.

Places were created in bulk in 2018. Modifications also peaked in 2018, with the next highest peak in 2024. Of the top 10 modifiers, two accounts were deleted. More details follow.

  • Created with no further modifications: 25387
    • top 10 creators
user | count | cumulative count | cumulative percentage
adivik2000 | 20028 | 20028 | 78.89%
chandusekharreddy | 2216 | 22244 | 87.62%
Harshavts | 855 | 23099 | 90.99%
Srihari Thalla | 452 | 23551 | 92.77%
Harsha1999 | 297 | 23848 | 93.94%
sfrank | 296 | 24144 | 95.10%
Nivas234 | 286 | 24430 | 96.23%
Ortsangabe | 190 | 24620 | 96.98%
muzirian | 111 | 24731 | 97.42%
Somnath391 | 78 | 24809 | 97.72%
    • creation year
year | count
2008 | 4
2009 | 1
2010 | 1
2011 | 9
2012 | 43
2013 | 12
2014 | 36
2015 | 79
2016 | 250
2017 | 403
2018 | 22460
2019 | 66
2020 | 349
2021 | 22
2022 | 13
2023 | 1061
2024 | 578


  • Modified places: 7284
    • Last modifying users, up to 80% cumulative share
user | count | cumulative count | cumulative percentage
adivik2000 | 2423 | 2423 | 33.26%
arjunaraoc | 1375 | 3798 | 52.14%
mottiger | 704 | 4502 | 61.81%
Harshavts | 390 | 4892 | 67.16%
Nivas234 | 240 | 5132 | 70.46%
ph4ni | 190 | 5322 | 73.06%
tg4567 | 184 | 5506 | 75.59%
mueschel | 183 | 5689 | 78.10%
Harsha1999 | 181 | 5870 | 80.59%
    • Last modified year
year | count
2009 | 1
2010 | 1
2011 | 4
2012 | 8
2013 | 7
2014 | 10
2015 | 82
2016 | 31
2017 | 58
2018 | 2469
2019 | 629
2020 | 1150
2021 | 108
2022 | 112
2023 | 1126
2024 | 1488
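
The cumulative columns in the user tables can be recomputed from the per-user counts. The following is a minimal sketch of that calculation, not the script actually used for this analysis; the function name `cumulative_share` is made up for illustration, and the sample counts are the first three rows of the last-modifier table.

```python
from itertools import accumulate


def cumulative_share(counts, total):
    """Return (user, count, running_total, running_percent) rows, largest count first."""
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    running = accumulate(count for _, count in ordered)
    return [
        (user, count, cum, round(100 * cum / total, 2))
        for (user, count), cum in zip(ordered, running)
    ]


# First three last modifiers and the total of modified places, copied from this page.
top_modifiers = {"adivik2000": 2423, "arjunaraoc": 1375, "mottiger": 704}
rows = cumulative_share(top_modifiers, total=7284)
for row in rows:
    print(row)
```

Passing the full per-user counts and stopping at the first row whose running percentage reaches 80 would give the "up to 80%" cut used in the table.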

Process

  1. Generate OSM data in CSV format using overpass-turbo
  2. Generate OSM data as an osm file (the file updated in step 10) using overpass-turbo
  3. Generate Wikidata (WD) data using a WD query
  4. Open the OSM CSV data in OpenRefine
  5. Set up a reconciliation service using the WD data
  6. Reconcile based on the English name
  7. Add the matched wikidata and name:te columns from Wikidata after reconciling
  8. Create osmfullid from the type and id columns
  9. Export the resulting table as CSV
  10. Run osmcsvappender to add the info from the result table to the OSM data
  11. Open JOSM with the updated osm file
  12. Validate changes and upload
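
Step 8 can also be done outside OpenRefine. The sketch below is only an illustration under assumptions: it assumes the CSV export has `@type` and `@id` columns (the header style Overpass CSV output uses), and it assumes an osmfullid of the form `node/123`. The exact format osmcsvappender expects is not stated on this page, so the separator is left as a parameter. The sample rows and the `Q_EXAMPLE` value are invented.

```python
import csv
import io


def add_osmfullid(rows, separator="/"):
    """Yield a copy of each row with an extra 'osmfullid' column built from @type and @id."""
    for row in rows:
        row = dict(row)
        # Assumed format: "<type><separator><id>", e.g. "node/101".
        row["osmfullid"] = f"{row['@type']}{separator}{row['@id']}"
        yield row


# Hypothetical two-row export; real data would come from the overpass-turbo CSV.
sample_csv = "@type,@id,name,wikidata\nnode,101,Place A,Q_EXAMPLE\nway,202,Place B,\n"
reader = csv.DictReader(io.StringIO(sample_csv))
result = list(add_osmfullid(reader))
for row in result:
    print(row["osmfullid"], row["name"])
```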

SPSR Nellore district, Kandukur mandal places

OSM data

  • Total: 26
  • Wikidata available: 4

WD data

  • Total: 20

Tewiki data

  • Revenue Village (RV): 14
  • Non RV: 12

Result

  • Reconciled total: 12
  • WD existing: 7
  • Updates done: 5 (only wikidata)

Time taken

Time taken: Not noted

Bapatla district, Parchur mandal

OSM data

  • Total: 26
  • Wikidata available: 4

WD data

  • Total: 20

Tewiki data

  • RV: 14
  • Non RV: 12

Result

  • Reconciled total: 11
  • WD existing: 4
  • Updates done: 11 (only wikidata)

Time taken

Time taken: Not noted

Bapatla district, Bapatla mandal

OSM data

  • Total: 45
  • Wikidata available: 4

WD data

  • Total: 19

Tewiki data

  • RV: 18
  • Non RV: 9

Result

  • Reconciled total: 14
  • WD existing: 4
  • Updates done: 10 (wikidata, name:te)

Time taken

Time taken: 1 hour

Bapatla district, other mandals

  • Karlapalem mandal - 1 (one duplicate node for Karlapalem village deleted)

Andhra Pradesh places update estimation

Note: The counts were obtained on 2024-04-03 and may differ when the scripts are actually run, depending on modifications made on the various platforms in the meantime.

  1. Number of mandals: 679
  2. Number of places in OSM: 32676
    1. Number of places with wikidata: 1749
    2. Number of places with name:te: 1680
    3. Number of places with wikidata and name:te: 845 (1675 on 2024-04-05)
  3. Number of villages in tewiki: 16768
  4. Number of villages in Wikidata with link to mandal: 16647
  5. Number of revenue villages in Wikidata: 15622
  6. Villages in enwiki: 2242


  • Number of person hours required: 679 hours (assuming one hour per mandal)
  • Number of potential updates for places: 6885 (based on 26 updates for 59 RV across three mandals, or 44%)
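
The arithmetic behind these estimates can be sketched as follows. Two points are assumptions rather than statements from this page: the 59 is taken to be the sum of the "WD data: Total" figures for the three mandals (20 + 20 + 19), since the RV figures add up to 46, and the rate is applied to the 15622 revenue villages in Wikidata, which lands within one of the 6885 quoted above.

```python
# Figures copied from the three mandal sections on this page.
updates_done = 5 + 11 + 10   # "Updates done": Kandukur, Parchur, Bapatla
wd_villages = 20 + 20 + 19   # "WD data: Total" for the same mandals (assumed basis of the 59)

rate = updates_done / wd_villages

# Assumption: the rate is applied to the revenue villages in Wikidata.
revenue_villages_wd = 15622
estimated_updates = round(revenue_villages_wd * rate)

mandals = 679
hours_per_mandal = 1         # assumption stated on this page
person_hours = mandals * hours_per_mandal

print(f"update rate: {rate:.0%}")
print(f"estimated updates: {estimated_updates}")
print(f"person hours: {person_hours}")
```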

Progress

From 2024-04-03 (# of modifications in brackets)

  • Based on existing Wikidata, add name:te (775)
  • Check places where name:te is blank although a wikidata value exists. Reconcile with Telugu Wikipedia articles and OSM to link articles and update name:te. Update wikidata values which have become redirects with their target values (89)
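
The first item, adding name:te from existing Wikidata, amounts to a lookup of the Telugu label for every place that has a wikidata tag but no name:te. A minimal sketch of that selection step is given below. It assumes the Telugu labels have already been fetched into a dictionary keyed by Wikidata id; the function name, the `osmfullid` key and the sample ids are hypothetical.

```python
def plan_name_te_updates(osm_places, te_labels):
    """Return {osmfullid: Telugu label} for places that have a wikidata tag,
    no name:te yet, and a known Telugu label."""
    updates = {}
    for place in osm_places:
        qid = place.get("wikidata")
        if not qid or place.get("name:te"):
            continue  # nothing to look up, or name:te already present
        label = te_labels.get(qid)
        if label:
            updates[place["osmfullid"]] = label
    return updates


# Hypothetical sample data; the ids are placeholders, not real Wikidata items.
places = [
    {"osmfullid": "node/1", "wikidata": "Q_A", "name:te": ""},
    {"osmfullid": "node/2", "wikidata": "Q_B", "name:te": "బాపట్ల"},
    {"osmfullid": "node/3", "wikidata": "", "name:te": ""},
    {"osmfullid": "node/4", "wikidata": "Q_C", "name:te": ""},
]
labels = {"Q_A": "కందుకూరు", "Q_B": "బాపట్ల"}
planned = plan_name_te_updates(places, labels)
print(planned)
```

Places with a wikidata value but no Telugu label (node/4 in the sample) are left out, which matches the second item: those need manual reconciliation.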

Useful tips

  1. Identify old id redirects for villages: for fixing old Wikidata ids on OSM which have become redirects
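
One possible way to find such redirects is through the Wikidata Query Service, where a redirected item is linked to its target with `owl:sameAs`. The sketch below only builds the query text and is not necessarily the method behind the tip above; the function name is made up, and the ids in the usage example are arbitrary well-formed ids, not known redirects.

```python
import re

QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def build_redirect_query(qids):
    """Build a SPARQL query returning (old, new) pairs for ids that are redirects."""
    for qid in qids:
        if not QID_PATTERN.fullmatch(qid):
            raise ValueError(f"not a Wikidata item id: {qid!r}")
    values = " ".join(f"wd:{qid}" for qid in qids)
    return (
        "PREFIX wd: <http://www.wikidata.org/entity/>\n"
        "PREFIX owl: <http://www.w3.org/2002/07/owl#>\n"
        "SELECT ?old ?new WHERE {\n"
        f"  VALUES ?old {{ {values} }}\n"
        "  ?old owl:sameAs ?new .\n"
        "}"
    )


# Arbitrary ids for illustration; in practice these would be the wikidata values from OSM.
query = build_redirect_query(["Q42", "Q64"])
print(query)
```

Each `?old`/`?new` pair returned by such a query gives the wikidata value to replace on OSM and its target.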