Tiles@home/Website/FilesystemStorage
Summary
Come up with a hash that allows all the tiles (up to 2^18 per layer) to be stored in the filesystem, such that each directory contains only a manageable number of subdirectories or files.
Current status
- 55GB of tiles now in the filesystem
- Can be viewed using the existing API
- Can be uploaded using the existing API
- Metadata is not available for tiles being uploaded at the moment
Inodes problem
- We have a new problem - there are currently 15M tiles, but only 10M inodes free on the filesystem... I suggest that each zoom level from z12 to z17 is stored in a direct-access archive (not tar), so the clients push up either 1 or 6 files to the server (Q: should all 6 zoom levels be in one [big] file?). Server has a C app that directly pulls the PNG from the file. This will save a lot of inodes, and disk space as there is no block tail space wasted. The server does not need to create these files ever, only pull images from them. I have written a perl script to create a custom archive, but maybe ar would be a different choice (not good for people running non *IX systems, though). Matthew Newton 15:42, 24 February 2007 (UTC)
- Instead of adding all this extra complexity, why not just use a loopback-mounted filesystem, as somebody suggested on the mailing-list? A quick read of [1] suggests this should not be too inefficient. An ext3 filesystem with appropriate blocksize and bytes/inode ratio would be simple to create, and would only require an admin to mount it. --Bobkare
- Probably the second-best solution after formatting a real partition. But it would create one huge file in the host file system (requiring special treatment when doing backups and so on, and a host that supports 2+ GB files), and depending on the client file system it might be necessary to actually create a file that has all the "extra space" we may need in the coming months already allocated, thus immediately using much more space on the host file system than would be required by a slowly growing amount of individual tiles. I believe some file systems can be resized but I do not know how this plays together with a loopback-mounted file. --Frederik Ramm 21:01, 24 February 2007 (UTC)
- I believe the host filesystem is ext3, which supports sparse files. I believe something like dd if=/dev/zero of=tilesfilesystem.img bs=1024 seek=2000000 count=0 would create a 2GB sparse file which a filesystem could then be created on. The file would only take up as much space as the filesystem is actually using. --Bobkare
- If we go down the archive file format route, I'd suggest checking the Internet Archive's format. Basically it's a mix of 100MiB sized containers and a database for fast access. Dekarl 10:33, 26 February 2007 (UTC)
- Why not just add a new disk/partition and spread the tiles on different partitions or disks based on zoom level or some other easy to handle "hash"? Same way that admins split up /home on large installations (/home/a, /home/b, etc). This would also work as a simple I/O load balancer if multiple disks are used. Onion 13:48, 6 March 2007 (GMT+2)
- You have the same problem that most large email systems have. I think all the loopback/single-file/archive approaches are no good, because they will be much slower and there may be compatibility problems with them in future. So just use a hash function to spread the files across MANY disks. If you don't have an enterprise solution (for best performance, many RAIDed disks, which will do most of the I/O), you can use LVM (or similar) volumes. This approach is good because, if you choose a single volume for every zoom level, you can extend each volume depending on its needs; but because higher zoom levels have more tiles, it is not a good solution for performance. So the best you can do is connect a disk array, split it into volumes (in a smart way), and use a simple storage hash with two or more levels: disk1....disk10 (first hash level), then a directory structure like the one squid uses, for example. Adrian Siemieniak 21.10.2007
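The direct-access archive idea from the first comment could look roughly like the following Python sketch. The layout used here (a tile count, then an index of x, y, offset and length records, then the concatenated PNG bytes) is an assumption made purely for illustration; it is not the format of the Perl script mentioned above, which is not shown on this page. The names `write_archive` and `read_tile` are hypothetical.

```python
import struct

# Hypothetical layout: 4-byte tile count, then one 16-byte index record per
# tile (x, y, offset, length as unsigned 32-bit big-endian), then PNG data.
HEADER = struct.Struct(">I")
RECORD = struct.Struct(">IIII")


def write_archive(path, tiles):
    """Write a dict {(x, y): png_bytes} into a single archive file."""
    keys = sorted(tiles)
    # Data starts right after the header and the complete index.
    offset = HEADER.size + RECORD.size * len(keys)
    with open(path, "wb") as f:
        f.write(HEADER.pack(len(keys)))
        for x, y in keys:
            data = tiles[(x, y)]
            f.write(RECORD.pack(x, y, offset, len(data)))
            offset += len(data)
        for key in keys:
            f.write(tiles[key])


def read_tile(path, x, y):
    """Return the PNG bytes for tile (x, y), or None if it is not stored."""
    with open(path, "rb") as f:
        (count,) = HEADER.unpack(f.read(HEADER.size))
        # Linear scan of the index; one seek then fetches the image itself.
        for _ in range(count):
            tx, ty, offset, length = RECORD.unpack(f.read(RECORD.size))
            if (tx, ty) == (x, y):
                f.seek(offset)
                return f.read(length)
    return None
```

One archive of this kind uses a single inode however many tiles it holds, and the server only ever needs `read_tile`. Because the index is written in sorted order, the linear scan could be replaced by a binary search if lookups turn out to be slow.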
Using mod-rewrite
The following RewriteRules should work. The last two lines also redirect to a script that can fetch a tile from the DB if it's not in the filesystem in a transitional period.
RewriteEngine on
RewriteBase /~bob/osmtiles/
RewriteRule ^([0-9]+)/([0-9]+)/([0-9]+).png$ 00$1/00000$2/00000$3.png
RewriteRule ^0*([0-9]{2})/0*([0-9]{3})([0-9]{3})/0*([0-9]{3})([0-9]{3}).png $1/$2/$3/$4/$5.png
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* tile.php?tile=%{REQUEST_URI}
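To see what the two path rules above do to a request, they can be emulated outside Apache. The Python sketch below transcribes only the two regular expressions and their substitutions; the function name `rewrite_steps` is hypothetical, and the emulation ignores RewriteBase, the RewriteCond and the fallthrough to tile.php.

```python
import re

# The two path rules, transcribed as (pattern, substitution) pairs.
RULES = [
    (re.compile(r"^([0-9]+)/([0-9]+)/([0-9]+).png$"),
     r"00\g<1>/00000\g<2>/00000\g<3>.png"),
    (re.compile(r"^0*([0-9]{2})/0*([0-9]{3})([0-9]{3})/0*([0-9]{3})([0-9]{3}).png"),
     r"\g<1>/\g<2>/\g<3>/\g<4>/\g<5>.png"),
]


def rewrite_steps(path):
    """Return the path before rewriting and after each rule that matched."""
    steps = [path]
    for pattern, template in RULES:
        m = pattern.search(path)
        if m:
            # A matching rule replaces the whole path with its substitution.
            path = m.expand(template)
            steps.append(path)
    return steps


print(rewrite_steps("6/123/34.png"))
# ['6/123/34.png', '006/00000123/0000034.png', '06/000/123/000/034.png']
```

The first rule pads every component with zeros so that the second rule can always find two zoom digits and six digits each for x and y; the leading `0*` in the second rule then discards the surplus padding.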
The script will get weird paths passed, I think it's because of this bug: [2]
- Would it be possible or even advisable to have the script automatically move the tile from the db to the fs? In a way pre-caching the tile for the next request. Problems: race condition with upload/update, additional load on the db... --Deelkar
- Perhaps it could submit tiles into a queue, which would then be read by a script that moves the tiles, and takes proper care to not overload the machine? --Bobkare
- The tile upload script (which will put new tiles on the FS) must also then remove them from the DB, of course. The download script will get the tile from the database, so it might as well drop it into the filesystem (safe: call it yyy.png.$$ and mv it to the correct location) and rm it from the DB. The lack of MySQL locks should not cause problems here, from what I can see. Matthew Newton 20:37, 23 February 2007 (UTC)
- I do not see any reason to move tiles to and from the database during production use. At the time the switch is made, export all tiles from the database into the file system, and after that, no more tiles in the database. Ever. Uploaded tiles go directly to the file system. Downloaded tiles come directly from the file system.
- If there is a reason why having tiles in the database is good and that I am overlooking, then use Apache's cache mechanism to store the files somewhere for fast access and have the master in the database - but that doubles the amount of space used and, as stated before, I cannot see the merits.
- As for the directory structure created by above RewriteRules: Well thought out (maybe one "0" less in the first RewriteRule will suffice?) But as I've pointed out here, 2^18 files per directory level shouldn't be a problem. (Note that you may have to re-create the file system to allow tons of Inodes - either way).
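Whichever script ends up writing tiles to the filesystem, the "write under a temporary name, then mv" step suggested in the discussion above can be sketched as follows. This is a Python illustration only (the real scripts are PHP); the function name `store_tile` and the 0644 permissions are assumptions.

```python
import os
import tempfile


def store_tile(final_path, png_bytes):
    """Write a tile so that readers never see a partially written file."""
    directory = os.path.dirname(final_path) or "."
    os.makedirs(directory, exist_ok=True)
    # The temporary file lives in the target directory, so the final rename
    # never crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(png_bytes)
        # mkstemp creates the file as 0600; assume the web server needs to read it.
        os.chmod(tmp_path, 0o644)
        # Atomic on POSIX filesystems: an existing tile is swapped in one step.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the temporary file if writing or renaming failed.
        os.unlink(tmp_path)
        raise
```

A concurrent download therefore sees either the old tile or the new one, never a truncated PNG, which is the property that makes the missing database locks harmless for this step.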
How secure is mod_rewrite? If we implemented those rules as above -- is it possible to construct a URL which resolves to something that isn't a tile? (e.g. to read files in non-public directories) Ojw 16:16, 24 February 2007 (UTC)
- The rules are quite foolproof IMHO. I'd skip the last one that implements the fallthrough to a script (won't do any good anyway). So you have two rules which both re-write something.png to somethingelse.png, in the process moving around only numbers. No way someone could sneak in a "../" or "/etc/passwd" or something. (Note that it is possible to write broken rewrite rules that in fact allow access to different files on the system! But if they are contained in a <Directory> element in the configuration OR include a RewriteBase like those above do, then any rewriting takes place only inside that "sandbox", and you cannot go outside.) --Frederik Ramm 18:47, 24 February 2007 (UTC)
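The claim that the two path rules only move numbers around can be probed with a few hostile paths. The Python sketch below emulates just the regular expressions and substitutions; it says nothing about Apache's own URL decoding or about the fallthrough rule, and the list of hostile inputs is an arbitrary sample chosen for illustration. The names `rewritten`, `SAFE` and `HOSTILE` are hypothetical.

```python
import re

# The two path rules, transcribed with the dot left unescaped as in the rules.
RULES = [
    (re.compile(r"^([0-9]+)/([0-9]+)/([0-9]+).png$"),
     r"00\g<1>/00000\g<2>/00000\g<3>.png"),
    (re.compile(r"^0*([0-9]{2})/0*([0-9]{3})([0-9]{3})/0*([0-9]{3})([0-9]{3}).png"),
     r"\g<1>/\g<2>/\g<3>/\g<4>/\g<5>.png"),
]
# A rewritten path counts as safe if it is only digit groups plus ".png".
SAFE = re.compile(r"^[0-9]+(/[0-9]+)*\.png$")


def rewritten(path):
    """Return the rewritten path, or None if neither rule matched."""
    matched = False
    for pattern, template in RULES:
        m = pattern.search(path)
        if m:
            # A matching rule replaces the whole path with its substitution.
            path = m.expand(template)
            matched = True
    return path if matched else None


HOSTILE = [
    "../etc/passwd",
    "12/../../etc/passwd.png",
    "12/%2e%2e/56.png",
    "12/002048/001365.png/../../secret",  # second rule has no trailing "$"
    "12/34/56Xpng",                       # unescaped "." matches the "X"
]

for candidate in HOSTILE:
    result = rewritten(candidate)
    assert result is None or SAFE.match(result), candidate
```

The last two inputs do match a rule, but since the substitution is built only from captured digits and literal characters, the result is still a plain tile path. Anything that matches neither rule would reach the fallthrough rule, which is where the script itself has to validate its input.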
PHP Functions
To convert from web format (06/123/34.png to 06/000/123/000/034.png):
function fromwebtodir($file) {
    $newfile = preg_replace("/(\d+)\/(\d+)\/(\d+)\.png/", "00$1/00000$2/00000$3.png", $file);
    $newfile = preg_replace("/0*(\d{2})\/0*(\d{3})(\d{3})\/0*(\d{3})(\d{3})\.png/", "$1/$2/$3/$4/$5.png", $newfile);
    return $newfile;
}
alternative version that does not use regexps, but looks horrible ;)
function fromwebtodir($file) {
    $nums = substr($file, 0, strpos($file, "."));
    $n[0] = strtok($nums, "/");
    $n[1] = strtok("/");
    $n[2] = strtok("/");
    $n[0] = substr("000" . $n[0], -2);
    $n[3] = substr("00000" . $n[2], -6, 3);
    $n[4] = substr("00000" . $n[2], -3, 3);
    $n[2] = substr("00000" . $n[1], -3, 3);
    $n[1] = substr("00000" . $n[1], -6, 3);
    return join("/", $n) . ".png";
}
other proposal by User:Frederik Ramm (but untested):
function fromwebtodir($file) {
    // Strip the ".png" suffix, then split into zoom, x and y
    $pathElements = explode("/", substr($file, 0, -4));
    // %03d truncates the quotient, so x/1000 yields the leading three digits
    return sprintf("%02d/%03d/%03d/%03d/%03d.png",
        $pathElements[0],
        $pathElements[1] / 1000, $pathElements[1] % 1000,
        $pathElements[2] / 1000, $pathElements[2] % 1000);
}
but honestly I (F.R.) do not see any reason why that code would ever be needed in PHP since this translation is handled by the Apache rewrite rule exclusively?
- Because the upload script needs to know where to put things? Ojw 13:23, 24 February 2007 (UTC)
...and from dir format back to web format:
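function fromdirtoweb($file) {
    // Join the split groups again: 06/000/123/000/034.png becomes 06/000123/000034.png
    $newfile = preg_replace("/(\d+)\/(\d+)\/(\d+)\/(\d+)\/(\d+)\.png/", "$1/$2$3/$4$5.png", $file);
    // Strip leading zeros, keeping a single "0" when a component is zero
    $newfile = preg_replace("/0*(\d+)\/0*(\d+)\/0*(\d+)\.png/", "$1/$2/$3.png", $newfile);
    return $newfile;
}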
-- Matthew Newton 21:53, 23 February 2007 (UTC)
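As a cross-check of the two directions, the same mapping can be written with integer arithmetic and tested as a round trip, including the edge case of a coordinate of 0. This is a Python sketch, not part of the PHP site code; the names `web_to_dir` and `dir_to_web` are hypothetical, and it works on numeric (zoom, x, y) values instead of path strings.

```python
def web_to_dir(zoom, x, y):
    """Map (zoom, x, y) to the split directory form zz/xxx/xxx/yyy/yyy.png."""
    x_high, x_low = divmod(x, 1000)
    y_high, y_low = divmod(y, 1000)
    return "%02d/%03d/%03d/%03d/%03d.png" % (zoom, x_high, x_low, y_high, y_low)


def dir_to_web(path):
    """Inverse of web_to_dir: recover (zoom, x, y) from the directory form."""
    stem = path[:-len(".png")]
    zoom, x_high, x_low, y_high, y_low = (int(part) for part in stem.split("/"))
    return zoom, x_high * 1000 + x_low, y_high * 1000 + y_low


# Round trip over a few tiles, including coordinate 0 and the z17 maximum.
for tile in [(6, 123, 34), (12, 0, 0), (17, 131071, 131071)]:
    assert dir_to_web(web_to_dir(*tile)) == tile
```

With this split no directory below the zoom level can hold more than 1000 entries, since each path component is a three-digit group.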
OJW's version of the rewrite rules
As installed in /~ojw/Tiles/.htaccess
RewriteEngine on
RewriteBase /~ojw/Tiles/
RewriteRule ^tile.php/([0-9]+)/([0-9]+)/([0-9]+).png$ 00$1/00000$2/00000$3.png
RewriteRule ^0*([0-9]{2})/0*([0-9]{3})([0-9]{3})/0*([0-9]{3})([0-9]{3}).png data/Tiles/$1/$2/$3/$4/$5.png
RewriteRule ^(.*)_exists exists.php
RewriteRule ^(.*)_details details.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* db.php?tile=%{REQUEST_URI}
/~ojw/Tiles/data is a symlink to the tile area (which contains a Tiles/ directory)
Ojw 16:13, 4 March 2007 (UTC)