User:EmericusPetro/sandbox/Wiki-as-base
This page is an example test page for Wiki-as-base, a strategy that turns MediaWiki pages into a form of w:Storage_virtualization, going beyond the already data-mined infoboxes used by Taginfo et al. (see Taginfo/Parsing_the_Wiki). The current proofs of concept are the Python package wiki_as_base and a serverless / function-as-a-service deployment with OpenFaaS. Apologies for the lack of more detailed documentation: as of 2023-01-03, the parsing strategies used by Wiki-as-base are a work in progress.
- Server side (TL;DR do not consume more than 50% of server resources)
- With command line (required: pip install wiki_as_base --upgrade)
- JSON-LD:
wiki_as_base --page-title 'User:EmericusPetro/sandbox/Wiki-as-base'
- Zip:
wiki_as_base --page-title 'User:EmericusPetro/sandbox/Wiki-as-base' --output-zip-file /tmp/wiki-as-base.zip
- JSON-LD:
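The command-line calls above can also be driven from a script. The sketch below only assembles the argument list from the two flags shown on this page (--page-title and --output-zip-file); the helper name build_wiki_as_base_command is hypothetical, and no other option of the tool is assumed.

```python
import shlex


def build_wiki_as_base_command(page_title, output_zip_file=None):
    """Build the argv list for the wiki_as_base CLI (hypothetical helper)."""
    argv = ["wiki_as_base", "--page-title", page_title]
    if output_zip_file is not None:
        # Only added when a zip export is wanted, as in the "Zip" example.
        argv += ["--output-zip-file", output_zip_file]
    return argv


command = build_wiki_as_base_command(
    "User:EmericusPetro/sandbox/Wiki-as-base",
    output_zip_file="/tmp/wiki-as-base.zip",
)
# Print a shell-safe version of the command instead of executing it.
print(shlex.join(command))
# To really run it (needs the package installed and network access):
#   import subprocess; subprocess.run(command, check=True)
```

Keeping the arguments as a list avoids shell quoting problems with page titles that contain spaces or colons.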
Data
YAML database
___wikiasbase: true
data:
  test_string: hello world!
  test_list:
    - aa
    - bb
    - cc
  test_number: 123
Another non example
- ignore
- this
- list
Another example
# Example of data to be parsed outside
___wikiasbase: true
data:
  test_string: test2
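The examples above suggest that the line ___wikiasbase: true works as an opt-in marker: the two YAML blocks carry it, while the "non example" list does not. A minimal sketch of that selection step follows, assuming the code blocks have already been extracted from the page as strings. The function name select_marked_blocks is hypothetical and this is not the package's actual implementation.

```python
MARKER = "___wikiasbase: true"


def select_marked_blocks(blocks):
    """Return only the blocks that contain the marker on a line of its own."""
    selected = []
    for block in blocks:
        stripped_lines = [line.strip() for line in block.splitlines()]
        if MARKER in stripped_lines:
            selected.append(block)
    return selected


# The three blocks from this section, in page order.
blocks = [
    "___wikiasbase: true\ndata:\n  test_string: hello world!\n",
    "- ignore\n- this\n- list\n",
    "# Example of data to be parsed outside\n"
    "___wikiasbase: true\ndata:\n  test_string: test2\n",
]
marked = select_marked_blocks(blocks)
print(len(marked), "of", len(blocks), "blocks carry the marker")
```

The selected blocks could then be handed to any YAML parser; that step is left out here to keep the sketch free of third-party dependencies.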
RDF
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX osmnode: <https://www.openstreetmap.org/node/>
PREFIX osmrel: <https://www.openstreetmap.org/relation/>
PREFIX osmway: <https://www.openstreetmap.org/way/>
PREFIX osmm: <https://example.org/todo-meta/>
PREFIX osmt: <https://wiki.openstreetmap.org/wiki/Key:>
PREFIX osmx: <https://example.org/todo-xref/>
PREFIX wikidata: <http://www.wikidata.org/entity/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
osmnode:1
    osmm:changeset 124176968 ;
    osmm:loc "Point(42.7957187 13.5690032)"^^geo:wktLiteral ;
    osmm:timestamp "2022-07-28T09:47:39Z"^^xsd:dateTime ;
    osmm:type "n" ;
    osmm:user "owene" ;
    osmm:userid 29598 ;
    osmm:version 33 ;
    osmt:communication:microwave "yes" ;
    osmt:communication:radio "fm" ;
    osmt:description "Radio Subasio" ;
    osmt:frequency "105.5 MHz" ;
    osmt:man_made "mast" ;
    osmt:name "Monte Piselli - San Giacomo" ;
    osmt:tower:construction "lattice" ;
    osmt:tower:type "communication" ;
.
SPARQL
owl-classes.rq
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?class ?label ?description
WHERE {
    ?class a owl:Class.
    OPTIONAL { ?class rdfs:label ?label}
    OPTIONAL { ?class rdfs:comment ?description}
}
LIMIT 25
Wiki template
{{ValueDescription
|key=highway
|value=residential
|image=File:Residential.jpg
|description=Road in a residential area
|osmcarto-rendering=File:Rendering-highway_residential.png
|osmcarto-rendering-size=125px
|group=highways
|onNode=no
|onWay=yes
|onArea=no
|onRelation=no
|combination=
* {{tag|name}}
* {{tag|oneway}}
* {{Tag|surface}}
* {{Tag|access}}
|seeAlso=
* {{Tag|highway|unclassified}}
* {{Tag|highway|living_street}}
* {{Tag|highway|service}}
* {{Tag|highway|tertiary}}
|implies = * {{tag|access|yes}}
|status=de facto
}}
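To show how a template call like the one above could be turned into key/value data, here is a minimal sketch using only the standard library. It assumes a single, well-formed template with named parameters; the function names split_top_level and parse_template are hypothetical, and real wikitext (triple braces, comments, nowiki) needs a full parser.

```python
def split_top_level(text, sep="|"):
    """Split text on sep, ignoring separators nested in {{...}} or [[...]]."""
    parts, current, depth, i = [], [], 0, 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in ("{{", "[["):
            depth += 1
            current.append(pair)
            i += 2
        elif pair in ("}}", "]]"):
            depth -= 1
            current.append(pair)
            i += 2
        elif text[i] == sep and depth == 0:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(text[i])
            i += 1
    parts.append("".join(current))
    return parts


def parse_template(wikitext):
    """Parse one {{Name|key=value|...}} call into (name, params)."""
    body = wikitext.strip()
    if not (body.startswith("{{") and body.endswith("}}")):
        raise ValueError("not a template call")
    name, *raw_params = split_top_level(body[2:-2])
    params = {}
    for raw in raw_params:
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = raw.partition("=")
        params[key.strip()] = value.strip()
    return name.strip(), params


sample = """{{ValueDescription
|key=highway
|value=residential
|combination=
* {{tag|name}}
* {{Tag|surface}}
|implies = * {{tag|access|yes}}
|status=de facto
}}"""
name, params = parse_template(sample)
print(name, params["key"], params["value"], params["status"])
```

Tracking the nesting depth is what keeps the pipes inside {{tag|access|yes}} from being read as parameter separators of the outer template.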
Tables
See https://en.wikipedia.org/wiki/Help:Table
Style 1
Header text | Header text | Header text |
---|---|---|
Example | Example | Example |
Example | Example | Example |
Example | Example | Example |
Style 1
keyaaa | keybbb | keyccc | keyddd |
---|---|---|---|
1 | value 1 bbb | value 2 ddd | |
2 | value 2 bbb | value 2 ccc | value 3 ddd |
3 | value 3 bbb | value 3 ccc | value 3 ddd |
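As an illustration of how such a table could become records, the sketch below parses the basic wikitext table syntax described in Help:Table ({| to open, ! for headers, |- for a new row, | for cells, |} to close). The wikitext shown is an assumed reconstruction of two rows of the table above, not a copy of this page's source; the function name parse_wikitable is hypothetical, and cell attributes, multi-line cells and row/column spans are not handled.

```python
def parse_wikitable(wikitext):
    """Parse a simple wikitext table into a list of dicts keyed by header."""
    headers, rows, current = [], [], None
    for line in wikitext.splitlines():
        line = line.strip()
        if not line or line.startswith("{|") or line.startswith("|+"):
            continue  # blank line, table opening, or caption
        if line.startswith("|}"):
            break  # end of table
        if line.startswith("|-"):
            if current:
                rows.append(current)
            current = []
        elif line.startswith("!"):
            headers.extend(cell.strip() for cell in line[1:].split("!!"))
        elif line.startswith("|"):
            if current is None:
                current = []
            current.extend(cell.strip() for cell in line[1:].split("||"))
    if current:
        rows.append(current)
    return [dict(zip(headers, row)) for row in rows]


sample = """{| class="wikitable"
! keyaaa !! keybbb !! keyccc !! keyddd
|-
| 2 || value 2 bbb || value 2 ccc || value 3 ddd
|-
| 3 || value 3 bbb || value 3 ccc || value 3 ddd
|}"""
records = parse_wikitable(sample)
for record in records:
    print(record)
```

All values come back as strings; converting the first column to integers would be a separate, schema-specific step.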
Database Schemas
Table from https://wiki.openstreetmap.org/wiki/Databases_and_data_access_APIs
Schema name | Created with | Used by | Primary use case | Updatable | Geometries (PostGIS) | Lossless | hstore columns | Database |
---|---|---|---|---|---|---|---|---|
osm2pgsql | osm2pgsql | Mapnik, Kothic JS | Rendering | yes | yes | no | optional | PostgreSQL |
apidb | osmosis | API | Mirroring | yes | no | yes | no | PostgreSQL, MySQL |
pgsnapshot | osmosis | jXAPI | Analysis | yes | optional | yes | yes | PostgreSQL |
imposm | Imposm | | Rendering | no | yes | no | Imposm2: no, Imposm3: yes | PostgreSQL |
nominatim | osm2pgsql | Nominatim | Search, Geocoding | yes | yes | yes | ? | PostgreSQL |
ogr2ogr | ogr2ogr | | Analysis | no | yes | no | optional | various |
osmsharp | OsmSharp | | Routing | yes | no | ? | ? | Oracle |
overpass | Overpass API | | Analysis | yes | ? | yes | ? | custom |
mongosm | MongOSM | | Analysis | maybe | ? | ? | ? | MongoDB |
node-mongosm | Mongoosejs | | Analysis | yes | yes | yes | NA | MongoDB |
osmium | Osmium | | Analysis | no | yes | no | yes | PostgreSQL |
Structure
From https://wiki.openstreetmap.org/wiki/Node
name | value | description | |
---|---|---|---|
id | 64-bit integer number ≥ 1 | Node ids are unique between nodes. (However, a way or a relation can have the same id number as a node.) Editors may temporarily save node ids as negative to denote ids that haven't yet been saved to the server. Node ids on the server are persistent, meaning that the assigned id of an existing node will remain unchanged each time data are added or corrected. Deleted node ids must not be reused, unless a former node is now undeleted. | |
lat | decimal number ≥ −90.0000000 and ≤ 90.0000000 with 7 decimal places | Latitude coordinate in degrees (North of equator is positive) using the standard WGS84 projection. Some applications may not accept latitudes above/below ±85 degrees for some projections. | Do not use IEEE 32-bit floating point data type since it is limited to about 5 decimal places for the highest longitude. A 32-bit method used by the Rails port is to use an integer (by multiplying each coordinate in degrees by 1E7 and rounding it: this allows to cover all absolute signed coordinates in ±214.7483647 degrees, or a maximum difference of 429.4967295 degrees, a bit more than what is needed). For computing projections, IEEE 64 bit floating points are needed for intermediate results. The 7 rounded decimal places for coordinates in degrees define the worst error of longitude to a maximum of ±5.56595 millimeters on the Earth equator, i.e. it allows building maps with centimetric precision. With only 5 decimal places, the precision of map data would be only metric, causing severe changes of shapes for important objects like buildings, or many zigzags or angular artefacts on roads. |
lon | decimal number ≥ −180.0000000 and ≤ 180.0000000 with 7 decimal places | Longitude coordinate in degrees (East of Greenwich is positive) using the standard WGS84 projection. Note that the geographic poles will be exactly at latitude ±90 degrees but in that case the longitude will be set to an arbitrary value within this range. | |
tags | A set of key/value pairs, with unique key | See Map features for tagging guidelines. | |
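The fixed-point scheme described for lat and lon (multiply by 1E7, round, store as a 32-bit signed integer) can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the function names encode and decode are hypothetical, and the equator length is derived from the WGS84 semi-major axis of 6378137 m, so the resulting error (about 5.566 mm) may differ in the last digits from the ±5.56595 mm quoted above, depending on the Earth radius used.

```python
import math

SCALE = 10_000_000          # 1E7: seven decimal places of a degree
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1
WGS84_SEMI_MAJOR_AXIS_M = 6_378_137.0  # assumed radius for the equator


def encode(degrees):
    """Scale a coordinate in degrees to a signed 32-bit integer."""
    value = round(degrees * SCALE)
    if not INT32_MIN <= value <= INT32_MAX:
        raise OverflowError("coordinate does not fit in 32 bits")
    return value


def decode(value):
    """Convert the stored integer back to degrees."""
    return value / SCALE


# Largest representable coordinate: 2147483647 / 1E7 degrees.
max_degrees = INT32_MAX / SCALE

# Worst rounding error is half a step (0.5E-7 degrees) along the equator.
metres_per_degree = 2 * math.pi * WGS84_SEMI_MAJOR_AXIS_M / 360
worst_error_mm = 0.5 / SCALE * metres_per_degree * 1000

print(encode(13.5690032), decode(encode(13.5690032)))
print(max_degrees, round(worst_error_mm, 3))
```

Since longitudes only span ±180 degrees, the ±214.7483647 range leaves some headroom, which matches the "a bit more than what is needed" remark in the table.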
Data validation
SHACL rule
# filename = R001_wikidata.shacl.ttl
PREFIX osmm: <https://example.org/todo-meta/>
PREFIX osmnode: <https://www.openstreetmap.org/node/>
PREFIX osmt: <https://wiki.openstreetmap.org/wiki/Key:>
PREFIX osmway: <https://www.openstreetmap.org/way/>
PREFIX osmw: <https://wiki.openstreetmap.org/wiki/>
PREFIX osmx: <https://example.org/todo-xref/>
PREFIX osmvsh: <https://www.openstreetmap.org/validation/shacl/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX ex: <https://example.org/>
# "wikipedia/wikidata type tag that is incorrect according to not:* tag" via @matkoniecz
osmvsh:R001_wikidata
    a sh:NodeShape ;
    sh:targetClass osmw:Node, osmw:Way, osmw:Relation ;
    sh:property [
        sh:path osmt:brand:wikidata ;
        sh:disjoint osmt:not:brand:wikidata ;
        sh:message "Impossible to be, and not to be, at the same time something"@en ;
        sh:message "Impossível ser, e não ser, ao mesmo tempo algo"@pt ;
    ]
.
Data example
# filename = R001_wikidata-valid.tdata.ttl
PREFIX osmway: <https://www.openstreetmap.org/way/>
PREFIX osmt: <https://wiki.openstreetmap.org/wiki/Key:>
PREFIX osmw: <https://wiki.openstreetmap.org/wiki/>
osmway:220660292
    a osmw:Way ;
    osmt:brand:wikidata "Q1" ;
    osmt:not:brand:wikidata "Q2" ;
.
# filename = R001_wikidata-invalid.tdata.ttl
PREFIX osmnode: <https://www.openstreetmap.org/node/>
PREFIX osmway: <https://www.openstreetmap.org/way/>
PREFIX osmt: <https://wiki.openstreetmap.org/wiki/Key:>
PREFIX osmw: <https://wiki.openstreetmap.org/wiki/>
osmway:220660292
    a osmw:Way ;
    osmt:brand:wikidata "Q701338" ;
    osmt:not:brand:wikidata "Q701338" ;
.
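The rule above relies on sh:disjoint: a shape is violated when the value sets of the two properties share at least one value. The sketch below mimics that single check on plain dictionaries to show why the first data example passes and the second fails. It is not a SHACL engine; the function name is_disjoint and the dictionary layout (tag key mapped to a list of values) are assumptions made for illustration.

```python
def is_disjoint(tags, key, other_key):
    """Return True when the values of key and other_key have nothing in common."""
    values = set(tags.get(key, []))
    other_values = set(tags.get(other_key, []))
    # A missing property has no values, so it is trivially disjoint.
    return values.isdisjoint(other_values)


# Mirrors R001_wikidata-valid.tdata.ttl
valid_way = {
    "brand:wikidata": ["Q1"],
    "not:brand:wikidata": ["Q2"],
}
# Mirrors R001_wikidata-invalid.tdata.ttl
invalid_way = {
    "brand:wikidata": ["Q701338"],
    "not:brand:wikidata": ["Q701338"],
}

for label, way in (("valid", valid_way), ("invalid", invalid_way)):
    ok = is_disjoint(way, "brand:wikidata", "not:brand:wikidata")
    print(label, "conforms" if ok else "violates R001_wikidata")
```

A real validation run would load the shape and the data files into a SHACL processor instead; the set comparison only captures the core of the constraint.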
Heading 1
See https://en.wikipedia.org/wiki/Help:Wikitext#Sections
Heading 2
Heading 3
Heading 4
Heading 5
Heading 6
Heading 1
Heading 2
Heading 3
Heading 4
Heading 5
Heading 6
Description lists
See https://en.wikipedia.org/wiki/Help:Wikitext#Lists
Same line
- Term
- Definition1
Line by line
- Term
- Definition1
- Definition2
- Definition3
- Definition4