:Mon Dec 13 14:54:09 GMT 2004

From Matt Webb:

>here's my scenario, in which the system i'm building interacts with a black
>box, X: i ask X, please subscribe to these syndication feeds, please get
>anything on del.icio.us and Flickr tagged with "foo" [1]. i wait for a few
>days. i then use the bloglines API to pull out weblog entries, and some api or
>another to pull out the tagged information, and maybe another to do a search
>across the whole datastore for a URL [in the feed text] or keywords. X has gone
>away and looked after fetching and storing feeds, fixing rss 0.91, and throwing
>errors for 404'd feeds.

:Tue Dec 14 15:00:23 GMT 2004

http://www-106.ibm.com/developerworks/xml/library/x-rdfprov.html is edd's article on tracking rss provenance etc with redland contexts. this will be a useful approach, especially for ensuring that feeds are hosted on the same domains they're talking about, for events in the future. we may even be able to subclass edd's aggregator package as-is, then provide simple gateways for other feed formats in and out.

need to make sure that epistomat either has sensible support for contexts, or that we can provide it in a non-gnarly way. we can probably also augment fraggle with our more pleasant syntax for uris.

looking at edd's code as it stands, it's very low-level and full of workarounds for things that have since been fixed in the redland python API; a place to start, though... it mentions a TODO: recording last-modified and using if-modified-since. we need to get that working with urllib2. http://www.btree.net/python/http_web_services/etags.html runs through this process.
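
a rough sketch of the conditional request, assuming Python 3's urllib.request (the successor to urllib2). the function name, the example URL and the stored values are made up for illustration; only the two header names come from HTTP itself.

```python
import urllib.request


def conditional_request(url, etag=None, last_modified=None):
    """Build a GET request that lets the server answer 304 Not Modified."""
    req = urllib.request.Request(url)
    if etag:
        # validator remembered from the previous response's ETag header
        req.add_header("If-None-Match", etag)
    if last_modified:
        # validator remembered from the previous Last-Modified header
        req.add_header("If-Modified-Since", last_modified)
    return req


# hypothetical values, as if read back from the store
req = conditional_request(
    "http://example.org/feed.rss",
    etag='"foo010101"',
    last_modified="Mon, 13 Dec 2004 14:54:09 GMT",
)
# urllib normalises header capitalisation, so compare in lower case
sent = {name.lower(): value for name, value in req.header_items()}
```

with the default handlers, urllib.request.urlopen raises HTTPError for a 304, so the caller would catch that and treat code 304 as "nothing new" rather than as a failure.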

:Wed Dec 15 17:02:44 GMT 2004

http://sourceforge.net/projects/feedparser/

is mark pilgrim's last-ditch rss parser thing. i'd be happiest, i suppose, if it did straight transformation of any feed format into rss1. let's see...

happily it seems to have good handling for last-modified and etag based requests; i only have to receive and send the right headers from the store. it doesn't seem to do transformation; it just builds data structures from common feed elements and provides a nice interface for accessing properties...

having a look at the state of the redland store after running edd's demo, it holds a model like this:

{(r1103038973r1), [http://www.w3.org/1999/02/22-rdf-syntax-ns#_8], [http://sippey.com/archives/000757.php]} {{{[http://usefulinc.com/fraggie/fetch/1]}}}
{(r1103038973r1), [http://www.w3.org/1999/02/22-rdf-syntax-ns#_9], [http://www.scottandrew.com/main/2003_07#a000695]} {{{[http://usefulinc.com/fraggie/fetch/1]

this suggests we should keep an incrementing counter per feed, as well as a counter per fetch of it, to keep these numbers in a serial order? we shouldn't worry about it too much, as most of the output to queries will be lists of things constructed in date order. so what is the point of storing the sequentiality of items at all? we could plan for this but not bother in the first iteration, where all we need is a statement item -> partof -> channel.
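
a toy version of that first iteration, in plain Python rather than redland: each statement carries the fetch it came from as its context, and no sequence numbers are kept. the Quad tuple, the function names and the channel URI are assumptions for illustration.

```python
from collections import namedtuple

# a statement plus the context (here: the fetch) it was asserted in
Quad = namedtuple("Quad", "subject predicate object context")

store = set()


def record_item(item_uri, channel_uri, fetch_uri):
    """Assert item -> partof -> channel, tagged with the fetch as context."""
    store.add(Quad(item_uri, "partof", channel_uri, fetch_uri))


def items_of(channel_uri):
    """Item URIs for a channel, whichever fetch they were seen in."""
    return sorted(
        {q.subject for q in store
         if q.predicate == "partof" and q.object == channel_uri}
    )


channel = "http://example.org/channel"  # hypothetical channel URI
item = "http://sippey.com/archives/000757.php"
record_item(item, channel, "http://usefulinc.com/fraggie/fetch/1")
record_item(item, channel, "http://usefulinc.com/fraggie/fetch/2")
```

the same item seen in two fetches gives two quads but one item on the way out, which is roughly the "resolving multiples" question in miniature.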

:Wed Dec 22 16:01:35 GMT 2004

i am starting to sketch out code and made a distribution here, which includes the epistomat source and that of mark pilgrim's feedparser. i got distracted by this article of his, which was heavily linked to on the foaf wiki; the scutter vocab material there turned out to be not much use.

This <a href="http://diveintomark.org/archives/2003/07/21/atom_aggregator_behavior_http_level">mark pilgrim article about feed aggregation behaviour</a> looks like a good read, anyway.

:Tue Jan 11 07:49:03 IST 2005

eek, it's been a while. ongoing notes:

import httpserver

bloglines API for retrieval

feed mgmt - model, collections, collection instances

learning from past response rate - an urgency parameter which is calculated from the mean time between changes.
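
one way the urgency parameter could be worked out, as a sketch: take the mean gap between the times a feed was seen to change and clamp it to a whole number of hours. the function name is made up, and the 1 to 24 hour range is an assumption borrowed from the bbox:schedule note.

```python
from datetime import datetime, timedelta


def hours_between_fetches(change_times, lo=1, hi=24):
    """Mean gap between observed changes, clamped to lo..hi whole hours.

    change_times: datetime objects in ascending order.
    """
    if len(change_times) < 2:
        # not enough history to learn from: poll as rarely as allowed
        return hi
    gaps = [
        (later - earlier).total_seconds() / 3600
        for earlier, later in zip(change_times, change_times[1:])
    ]
    mean_gap = sum(gaps) / len(gaps)
    return max(lo, min(hi, round(mean_gap)))


start = datetime(2005, 1, 11, 7, 0)
six_hourly = [start + timedelta(hours=6 * i) for i in range(3)]
busy = [start + timedelta(minutes=10 * i) for i in range(6)]
```

a feed that changes every ten minutes still gets polled at most hourly, and a feed with no history falls back to once a day.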

http://frot.org/2005/bbox/

bbox:Feed
    bbox:source
        rss:channel

    bbox:last_status
        200/403/etc
    bbox:last_etag
        foo010101
    bbox:last_modified
        20059020213
    bbox:schedule
        (hours 1-24 between fetches?)

bbox:Visit
    ical:datetime
        2005etc
    bbox:status
        200/500/etc
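
the same vocabulary as plain Python objects, to check it holds together. this is only a sketch: the attribute names mirror the properties above, while the record method and the default schedule of 24 hours are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Visit:
    when: datetime  # ical:datetime
    status: int     # bbox:status


@dataclass
class Feed:
    source: str                          # bbox:source
    last_status: Optional[int] = None    # bbox:last_status
    last_etag: Optional[str] = None      # bbox:last_etag
    last_modified: Optional[str] = None  # bbox:last_modified
    schedule: int = 24                   # bbox:schedule, hours between fetches
    visits: List[Visit] = field(default_factory=list)

    def record(self, visit, etag=None, modified=None):
        """Append a visit and refresh the 'last_*' summary properties."""
        self.visits.append(visit)
        self.last_status = visit.status
        if etag:
            self.last_etag = etag
        if modified:
            self.last_modified = modified


feed = Feed(source="http://example.org/feed.rss")  # hypothetical feed
feed.record(Visit(datetime(2005, 1, 11, 7, 49), 200), etag="foo010101")
feed.record(Visit(datetime(2005, 1, 12, 7, 49), 304))
```

a 304 visit updates last_status but leaves the stored etag alone, since the server sent nothing new to remember.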

each item is tagged with a visit as context
resolving multiples on the way out?

special rules:
if 404 - check 5 previous fetches - if all 404 suspend
if 301 - follow, change bbox:source (permanent move)
if 302 - follow, make note (temporary; keep bbox:source)
if 410, switch off forever
 - other statuses embedded in feedparser?
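
the special rules as a small decision function. a sketch only: the action names and the shape of the history list are invented here. it follows HTTP's own definitions, where 301 is a permanent move (so the stored source is rewritten) and 302 is temporary (so the source is kept and the redirect just noted).

```python
def next_action(status, previous_statuses):
    """Decide what to do with a feed after a fetch.

    previous_statuses: statuses of earlier fetches, most recent last.
    """
    if status == 410:
        return "disable"  # gone: switch off forever
    if status == 404:
        last_five = previous_statuses[-5:]
        if len(last_five) == 5 and all(s == 404 for s in last_five):
            return "suspend"
        return "retry"
    if status == 301:
        return "update-source"    # permanent: change bbox:source
    if status == 302:
        return "follow-and-note"  # temporary: keep bbox:source
    if status in (200, 304):
        return "ok"
    return "retry"
```

a single 404 after a run of good fetches only earns a retry; it takes five 404s in a row before the current one to suspend the feed.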

parse gives us a dict-oriented model
we just use timestamped items and don't use the _1, _2 etc model?
as this will confuse us between different sources

d.etag, d.modified, d.status, d.feed.has_key('foo')
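
a sketch of pulling those fields back out for the store. it takes any dict-like object, which is how feedparser's parse result behaves, so it can be tried without the library installed. the function name and the returned keys are made up; has_key no longer exists in Python 3, so the membership test uses "in".

```python
def validators_from(parsed):
    """Collect what the next conditional fetch needs from a parse result.

    etag, modified and status are only present when the feed came over
    HTTP and the server supplied them, hence .get() rather than indexing.
    """
    return {
        "etag": parsed.get("etag"),
        "modified": parsed.get("modified"),
        "status": parsed.get("status"),
        # Python 3 spelling of d.feed.has_key('title')
        "has_title": "title" in parsed.get("feed", {}),
    }


# stand-in for a parse result, with hypothetical values
d = {"etag": "foo010101", "status": 200, "feed": {"title": "example"}}
remembered = validators_from(d)
```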

there is dc:creator support; we should patch to include foaf:maker, and always use a foaf model for creator details.

:Tue Feb 22 17:55:33 GMT 2005

long lag, in which i've spent a couple of hours making things compile and bashing on the epistomat, to the extent that feedreader hooks up, reads different formats, and collapses them into a model which has contexts.

made a simple http server for the bloglines interface, and now i'm wondering about user accounts. presumably we need them; i had half-envisioned one bbox for one collection of feeds.

options
- (a) make a bbox which doesn't know about user accounts, to test out and use for single-purpose installations (e.g. to crawl spatial info for wirelesslondon, and just have wirelesslondon talk to it)
- (b) make a bbox which has user accounts, and have a stub or generic one for single-purpose uses. don't worry about user management, but have some kind of HTTP basic auth for transactions.

case b is probably better, as it won't be much harder to do, will allow us to build in the right functionality straight away, and we can always have an 'all' mode superuser which can't "mark as read", which emulates case a, if that seems necessary.
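
what the basic auth side of case b might look like, as a sketch. the header format is standard HTTP Basic; the function names and the idea of spelling the superuser literally as "all" are assumptions.

```python
import base64


def parse_basic_auth(header_value):
    """Return (user, password) from an Authorization header, or None."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        # not valid base64, or not text once decoded
        return None
    user, sep, password = decoded.partition(":")
    return (user, password) if sep else None


def can_mark_as_read(user):
    """The 'all' superuser sees everything but keeps no read state."""
    return user != "all"


header = "Basic " + base64.b64encode(b"jo:secret").decode("ascii")
```

checking the password against a user store is left out; this only covers getting a user name out of the request so the rest of the transaction knows whose subscriptions it is touching.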

user-mode is not for collection of feeds, but it is for 'reading' them NNTP style and also for managing a subscription list, foaf-wise.

management etc can be done via the HTTP representation. The Sync API doesn't let you add subscriptions through it, so we need to create that.

we also need a new component: a crawler module that manages getting updates, http status comprehension and the timing of future actions. the model in the bbox already handles that stuff, with the practicalities of etags etc all supplied by feedparser, which is pretty cool.
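
a minimal sketch of the timing half of such a crawler, assuming a single process with a heap of due times rather than cron or twisted. the class, its methods and the use of plain hour counts for time are all invented for illustration; fetch is whatever callable does the real work and returns a status.

```python
import heapq


class Crawler:
    def __init__(self):
        self._queue = []  # heap of (due_hour, feed_url)

    def schedule(self, feed_url, due):
        heapq.heappush(self._queue, (due, feed_url))

    def run_due(self, now, fetch, interval_hours=24):
        """Fetch every feed whose time has come, then requeue it."""
        done = []
        while self._queue and self._queue[0][0] <= now:
            _, url = heapq.heappop(self._queue)
            done.append((url, fetch(url)))
            # requeue strictly in the future so the loop terminates
            heapq.heappush(self._queue, (now + interval_hours, url))
        return done


crawler = Crawler()
crawler.schedule("http://example.org/a.rss", due=0)  # hypothetical feeds
crawler.schedule("http://example.org/b.rss", due=5)
first = crawler.run_due(now=1, fetch=lambda url: 200)
second = crawler.run_due(now=5, fetch=lambda url: 304)
```

a fixed interval is used here for brevity; a per-feed schedule would slot in where interval_hours is added.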

we should probably think pretty seriously about moving to twisted, though; let's look at the docs and compare to a gang of cron jobs / dodgy daemons...

:Sun Mar 6 16:15:14 GMT 2005

keep thinking about this again in the context of wirelesslondon / as a grout replacement. it does what grout does for WL, with a more specialised and thought-out machine interface. it has optional 'spatial extensions', basically, which are stored in PostGIS, often for mapserver's benefit, with references to URIs that are members in a Redland store.

:Tue Mar 15 17:01:08 GMT 2005

done a fair bit of work on the underlying 'framework' or what have you. The upgraded rdf-object wrapper is almost debugged and dusted. This has been largely for the benefit of other applications, for wirelesslondon and the consume nodedb.

in that context i've also been having quite lovely experiences with Quixote, and can now see no reason to build http apps any other way. it can slot into twisted or fastcgi or what have you, easily.

i made a nice home page for bbox: http://frot.org/bbox/ and hope to get a public svn or cvs repository together just as soon as the tests pass. (tests!)

:Fri Mar 18 23:34:37 GMT 2005

Flush with "getting things done", i made a simple quixote ui stub for bbox, and started emulating bloglines API functions. I'll stick this stuff in CVS now. Doesn't do much yet, but it's not far off. A simple temporal query is outlined; a spatial bounding box (with different projections, at least wgs84 and utm zone N...?) should come next.

In theory redland supports RDQL and similar query languages. The question is mapping the column-table, variable-has-value results you get back from the RDF query into the graph which makes the statements that you'd like to complete. This isn't such a big deal short term. it will enable more interesting, foafy sort of things in the future...
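
one way to picture that mapping, as a sketch that doesn't touch redland at all: each result row is a dict of variable bindings, and a template of triple patterns with ?variables is filled in once per row. the function name, the '?' convention and the sample data are assumptions.

```python
def bindings_to_triples(rows, template):
    """Turn variable-has-value rows back into statements.

    rows: list of dicts mapping variable name -> value.
    template: list of (s, p, o) patterns; terms starting with '?' are
    replaced by the row's binding, and left as-is if unbound.
    """
    triples = []
    for row in rows:
        for pattern in template:
            triples.append(tuple(
                row.get(term[1:], term) if term.startswith("?") else term
                for term in pattern
            ))
    return triples


rows = [
    {"item": "http://example.org/1", "date": "2005-03-18"},
    {"item": "http://example.org/2", "date": "2005-03-19"},
]
template = [("?item", "dc:date", "?date")]
graph = bindings_to_triples(rows, template)
```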

:Tue Mar 22 20:22:00 GMT 2005

finally we sat down and fixed the rdfobj wrapper layer. i put a copy of it in here, which involved setting PYTHONPATH to include the rdfobj directory.

So this has facilitated a lot of stuff. Feeds download and are stored in the RDF model, but the clean etag/modified handling advertised by feedparser isn't seamless :/

i should open up the rdf import too. i wanted to check this in before i broke anything, though.

:Fri Mar 25 13:08:25 GMT 2005

I stole wholeheartedly from diveintopython.org a tactful http handler, which i'm using to pick at both rss and rdf feeds. I'm still having niggles serialising the context, but bbox is definitely ready to test now. (needs more tests written, too.)

The GIS handling which i'd tentatively inserted, i removed; there is a spatialStore object in the wirelesslondon code tree which would do the job better and more cleanly, opening up to a standalone spatial index abstraction and removing the postgis dependency, which is, well, kludgy.

next is to finish the http interface - bloglines - and figure out how best to do temporal searches; on a per-feed basis we can work around that, for now.

:Fri Apr 22 02:48:39 BST 2005

I realise i should have a lot of time and energy to devote to bbox at the moment, and am flailing a little faced with the code, looking at different applications.

I should do a source release, which would help. i should also add a crawler and collector component to wirelesslondon, to init from openguides and then pick up the recent changes RSS. That would be useful, but wouldn't help with the implications of bbox as a bigger bit of software.

I've been holding out for interfaces like the ontomatic, because that could really liberate me from the need to hack on cheesy web applications much, if at all.

Experimenting with drupal and its RSS aggregator enlightened me as to the need for a monitor-feed-index. perhaps just an RSS bot that i could ask for status, for now.

:Fri Apr 22 10:08:19 BST 2005

a simple way of doing user and feed management, basically. i wanted to allow people to hook in, or at least model, their own userdb. we have a lot of this code in wirelesslondon; needs plugged in to a simple deliciouslike API. we may as well bung a few template widgets for HTML into our handler for now, then abstract 'em out into the ontomatic later. o, and started an irc bot to do something like reporting, so i can ponder over monitoring functions. The idea is that the information about the latter should drop out of the model; if adequate info isn't contained in it, something is mildly wrong.

i just stole all the user account creation code from wl.user and dropped it into bbox and bbox.ui. This is definitely provoking me to wonder if i'm writing the same application. but i need to spike out of stasis at the moment.

:Sun Oct 9 10:28:49 BST 2005

Good lord, i've been slack with this process.

BBox changed a bit while i was writing nodel; now it only stores and queries geometry in wgs84, as reprojecting seemed unnecessarily complex. nodel uses bbox a lot, and there have been many small bugfixes to bbox in the process.

After i talked to Benoit Gregoire about it, i realised it should store full geometries for all types; there was only stub support for lines and polygons. i am adding that now, supporting a simple RSS serialisation like Mikel's one at http://brainoff.com/worldkit/doc/polygon.php . As spatial queries for bounding boxes were already being done by making a POLYGON and asking for stuff Within() it, this looks simple; the tests already pass; but now i have to go back through, fix the existing interfaces in bbox and get those passing again.
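
the bounding-box-as-polygon trick in miniature. a sketch with made-up function names: the first builds the WKT text a spatial database would be handed, the second is a pure-Python stand-in for the containment test on points, using strict comparisons because a point lying exactly on the boundary is not considered within a polygon.

```python
def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """WKT POLYGON for a lon/lat box, as a closed ring of five points."""
    corners = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),  # repeat the first corner to close the ring
    ]
    ring = ", ".join(f"{lon} {lat}" for lon, lat in corners)
    return f"POLYGON(({ring}))"


def point_within_bbox(lon, lat, min_lon, min_lat, max_lon, max_lat):
    """True if the point is strictly inside the box."""
    return min_lon < lon < max_lon and min_lat < lat < max_lat


london_ish = (-1, 51, 0, 52)  # hypothetical box in wgs84 degrees
wkt = bbox_to_wkt(*london_ish)
```

because a box is just one more polygon, storing full line and polygon geometries doesn't change how these queries are phrased.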

Then we need a plan for finding data. I know where there is a lot of data near me. in the past, i've collected it mostly using scripts - complete mirrors of the 'open guide to london', that kind of thing. now it really needs to be on an aggregation schedule.

If we're going to make a nodel UI for bbox then we might as well make a very simple feed-status-manager as well, just a browsable view on fbox.Feed class objects.

but a lot of aggregation events should actually be described by more codelike rules, and they are handled through nodel's API to different services, which is much more sophisticated than bbox's model of get feed, look for spatial stuff, remember it all.

i would say a lot of this for now can be driven by a script on the cron that is exploring the model - get me all tags which an event is tagged with and look at the flickr feed for updates, and so on... get me everything from EVNT from different people's changes and inboxes...
| 184 | | 184 | |
|---|
| 185 | :Mon Oct 30 20:51:14 GMT 2006 | 185 | :Mon Oct 30 20:51:14 GMT 2006 |
|---|
| 186 | | 186 | |
|---|
| 187 | It's been a long time. | 187 | It's been a long time. |
|---|
| 188 | | 188 | |
|---|
| 189 | I'm digging this codebase out because: | 189 | I'm digging this codebase out because: |
|---|
| 190 | | 190 | |
|---|
| 191 | - Jamie King was asking about it | 191 | - Jamie King was asking about it |
|---|
| 192 | - Saul keeps mentioning it in the context of rebooting wirelesslondon | 192 | - Saul keeps mentioning it in the context of rebooting wirelesslondon |
|---|
| 193 | - there is a remote possibility that i might get paid for it | 193 | - there is a remote possibility that i might get paid for it |
|---|
| 194 | - mapufacture, bless their cotton socks, have no real incentive to release other than goodwill, they need a structure around them. | 194 | - mapufacture, bless their cotton socks, have no real incentive to release other than goodwill, they need a structure around them. |
|---|
| 195 | | 195 | |
|---|
| 196 | Now i could be contributing my time to egging on mapufacture as i could be to owslib as well. But i'm reminded that bbox is not far off finished. That it did work pretty well just had bad performance problems on serialisation, trying to haul around too many bulky and interconnected python objects at once. | 196 | Now i could be contributing my time to egging on mapufacture as i could be to owslib as well. But i'm reminded that bbox is not far off finished. That it did work pretty well just had bad performance problems on serialisation, trying to haul around too many bulky and interconnected python objects at once. |
|---|
| 197 | | 197 | |
|---|
| 198 | When i started thinking about a WFS-basic implmenetation my first thought is that would belong here. Also if one were thinking about writing a prototype video metadata aggregator - as i assume Jamie is though i thought they were stuck into prototyping right now, and i know some other people are working on a drupal based solution - though this looks more like drawing, socialising and planning for a big sprint in the spring. But by then they (the transmission.cc people) need something that they can be learning from issues with and using to demonstrate proof of value for their contributing participants. | 198 | When i started thinking about a WFS-basic implmenetation my first thought is that would belong here. Also if one were thinking about writing a prototype video metadata aggregator - as i assume Jamie is though i thought they were stuck into prototyping right now, and i know some other people are working on a drupal based solution - though this looks more like drawing, socialising and planning for a big sprint in the spring. But by then they (the transmission.cc people) need something that they can be learning from issues with and using to demonstrate proof of value for their contributing participants. |
|---|
| 199 | | 199 | |
|---|
| 200 | One issue Jan [sic?] had lamented was the lack of extensibility of the aggregators supplied with drupal. BBox as is, is pretty much the same - it collects a common core of properties well known to feedparser, plus geo:lat and geo:long. feedparser at least is catholic about what it extracts; as long as it's easy to configure what should be learned, this shouldn't be hard to change and will be useful. (i wonder how it handles a lot of atom extensions? - we'll also have to look for updates). | 200 | One issue Jan [sic?] had lamented was the lack of extensibility of the aggregators supplied with drupal. BBox as is, is pretty much the same - it collects a common core of properties well known to feedparser, plus geo:lat and geo:long. feedparser at least is catholic about what it extracts; as long as it's easy to configure what should be learned, this shouldn't be hard to change and will be useful. (i wonder how it handles a lot of atom extensions? - we'll also have to look for updates). |
|---|
| 201 | | 201 | |
|---|
| 202 | BBox is totally meant to be light footprint and i see the dependency on nodel crept into it for its http interfaces. This shouldn't have to be the case now - nodel, though lovely, was an overgrowth, and can be replaced with the web.py currently in the geometa codebase. | 202 | BBox is totally meant to be light footprint and i see the dependency on nodel crept into it for its http interfaces. This shouldn't have to be the case now - nodel, though lovely, was an overgrowth, and can be replaced with the web.py currently in the geometa codebase. |
|---|
| 203 | | 203 | |
|---|
| 204 | How does this one connect to that - both are doing the broker/decorator thing - only the other has a very specific schema. A WFS interface could be appropriate for both though geometa only has envisaged, not implemented support for individual vector features. | 204 | How does this one connect to that - both are doing the broker/decorator thing - only the other has a very specific schema. A WFS interface could be appropriate for both though geometa only has envisaged, not implemented support for individual vector features. |
|---|
| 205 | | 205 | |
|---|
| 206 | We should a/ work from the data - find a good collection of features that we need to treat of and work from there | 206 | We should a/ work from the data - find a good collection of features that we need to treat of and work from there |
|---|
| 207 | b/ figure out one directed thing that we can do and finish and that others will see benefit in, whether that is simplifying and extending bbox or extending and rethinking geometa. WFS-basic is super appealing though i am less sure how to implement the equivalent of OWSCat over it. This would be simple and impressive to do. I would not mind restricting this so that the data or at least an index of it has to be in PostGIS. One could index all the shapes in a shapefile as long as one had some way of referring consistently to the originals. But this swiftly starts to get into the domain of an annotation system - a problem which looks the same as attaching potentially arbitrary properties and accompanying values to features and collections of them. This is why i keep thinking about bbox, because the arbitrariness is what the rdf store is for. With wfs-basic we don't need to mess around with geoserver and the allocation of URIs any more, and we get the facility of DescribeFeatureType to abuse how we like. | 207 | b/ figure out one directed thing that we can do and finish and that others will see benefit in, whether that is simplifying and extending bbox or extending and rethinking geometa. WFS-basic is super appealing though i am less sure how to implement the equivalent of OWSCat over it. This would be simple and impressive to do. I would not mind restricting this so that the data or at least an index of it has to be in PostGIS. One could index all the shapes in a shapefile as long as one had some way of referring consistently to the originals. But this swiftly starts to get into the domain of an annotation system - a problem which looks the same as attaching potentially arbitrary properties and accompanying values to features and collections of them. This is why i keep thinking about bbox, because the arbitrariness is what the rdf store is for. With wfs-basic we don't need to mess around with geoserver and the allocation of URIs any more, and we get the facility of DescribeFeatureType to abuse how we like. |
|---|
| 208 | | 208 | |
|---|
| 209 | | 209 | |
|---|
| 210 | :Sat Nov 4 03:14:17 GMT 2006 | 210 | :Sat Nov 4 03:14:17 GMT 2006 |
|---|
| 211 | | 211 | |
|---|
| 212 | Property extensibility crossed my mind briefly while looking back through __init__.py and i see a long rush of stuff saying "if e.has_key('geo_lat')" etc etc. We definitely have to fix this. This is even worse as it occurs conditionally according to whether or not one has enabled the spatial index. I'm thinking about making the spatial index mandatory. | 212 | Property extensibility crossed my mind briefly while looking back through __init__.py and i see a long rush of stuff saying "if e.has_key('geo_lat')" etc etc. We definitely have to fix this. This is even worse as it occurs conditionally according to whether or not one has enabled the spatial index. I'm thinking about making the spatial index mandatory. |
|---|
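One way the has_key chain could be replaced is with a lookup table that says which entry keys to keep and what to call them. This is only a sketch under assumptions: PROPERTY_MAP and extract_properties() are hypothetical names, not anything in bbox, and the entry is treated as a plain dict keyed the way the existing checks suggest (geo_lat, geo_long).

```python
# Hypothetical table: entry key as the parser reports it -> property name to store.
# Extending what gets collected means adding a row here, not another "if".
PROPERTY_MAP = {
    "title": "title",
    "link": "link",
    "geo_lat": "lat",
    "geo_long": "long",
}


def extract_properties(entry, property_map=PROPERTY_MAP):
    """Copy every configured key that the entry actually carries."""
    return {
        target: entry[source]
        for source, target in property_map.items()
        if source in entry  # modern spelling of entry.has_key(source)
    }


# Illustrative entry; a real one would come from the feed parser.
entry = {"title": "node 12", "geo_lat": "51.5", "geo_long": "-0.1", "summary": "x"}
props = extract_properties(entry)

# Picking up an extra property is a one-line change to the table.
extended = extract_properties(entry, {**PROPERTY_MAP, "summary": "summary"})
```

Keys missing from an entry are simply skipped, so the same function would serve whether or not the spatial index is enabled.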
| 213 | | 213 | |
|---|
| 214 | Part of this is because there's no date range query support here yet. I really thought there was; i think it dropped out over iterations when spatial query became all important. One can ask for N recent() things but that's based on date collected, not date emitted with the data. We look at the latter and store it in the RDF store in iCal format. Then the object going into the spatial store is decoupled and, as above, only stored if the index is enabled. Without it, we can't do date range queries without resorting all the way to SPARQL. Goodness knows rdfobj should have an interface through to SPARQL in redland, which just wasn't stable back when Schuyler and i wrote it. (SPARQL has dateTime-less-than and dateTime-greater-than predicates, but the syntax is messy and right at this minute i don't want to go there - i just want to get the baseline of WFS Simple implemented, no matter how nasty it looks inside for now, and worry later.) | 214 | Part of this is because there's no date range query support here yet. I really thought there was; i think it dropped out over iterations when spatial query became all important. One can ask for N recent() things but that's based on date collected, not date emitted with the data. We look at the latter and store it in the RDF store in iCal format. Then the object going into the spatial store is decoupled and, as above, only stored if the index is enabled. Without it, we can't do date range queries without resorting all the way to SPARQL. Goodness knows rdfobj should have an interface through to SPARQL in redland, which just wasn't stable back when Schuyler and i wrote it. (SPARQL has dateTime-less-than and dateTime-greater-than predicates, but the syntax is messy and right at this minute i don't want to go there - i just want to get the baseline of WFS Simple implemented, no matter how nasty it looks inside for now, and worry later.) |
|---|
| 215 | | 215 | |
|---|
| 216 | So basically i am adding a 'dated' datetime field to the index and that'll have to work just how within_box works now - construct the sql, run it, get a list of node identifiers back. | 216 | So basically i am adding a 'dated' datetime field to the index and that'll have to work just how within_box works now - construct the sql, run it, get a list of node identifiers back. |
|---|
| 217 | We should be able to pass date range limits into within_box or within_shape. Plus we want to be able to do date range queries without a box, for new. This just goes in the spatialStore.py module for now. Because that's an accreted mess which needs rewritten before any Shiny New Release, anyway. | 217 | We should be able to pass date range limits into within_box or within_shape. Plus we want to be able to do date range queries without a box, for new. This just goes in the spatialStore.py module for now. Because that's an accreted mess which needs rewritten before any Shiny New Release, anyway. |
|---|
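A rough sketch of the shape this could take: build the SQL from whichever limits are given, run it, and hand back node identifiers. Everything concrete here is an assumption made for illustration: the table name spatial_index, its columns, the find_nodes() function, and sqlite3 standing in for whatever database spatialStore.py really talks to. The box test is a plain coordinate comparison, not a real spatial operator.

```python
import sqlite3
from datetime import datetime


def find_nodes(conn, box=None, start=None, end=None):
    """Return node identifiers matching an optional box and optional date range."""
    clauses, params = [], []
    if box is not None:
        min_x, min_y, max_x, max_y = box
        clauses.append("x BETWEEN ? AND ? AND y BETWEEN ? AND ?")
        params += [min_x, max_x, min_y, max_y]
    if start is not None:
        clauses.append("dated >= ?")
        params.append(start.isoformat())
    if end is not None:
        clauses.append("dated <= ?")
        params.append(end.isoformat())
    sql = "SELECT node FROM spatial_index"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY node"
    return [row[0] for row in conn.execute(sql, params)]


# Made-up index contents; 'dated' is kept as ISO 8601 text so string
# comparison sorts the same way as the dates themselves.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spatial_index (node TEXT, x REAL, y REAL, dated TEXT)")
conn.executemany(
    "INSERT INTO spatial_index VALUES (?, ?, ?, ?)",
    [
        ("a", -0.10, 51.50, datetime(2006, 11, 1).isoformat()),
        ("b", -0.12, 51.52, datetime(2006, 11, 4).isoformat()),
        ("c", 2.35, 48.85, datetime(2006, 11, 5).isoformat()),
    ],
)

london = (-0.5, 51.0, 0.5, 52.0)
in_box_recent = find_nodes(conn, box=london, start=datetime(2006, 11, 3))
dated_only = find_nodes(conn, start=datetime(2006, 11, 3))
```

Making the box optional means the date-only query and the box-plus-date query share one code path, which might be easier to keep straight than separate within_box and date functions.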
| 218 | | 218 | |
|---|
| 219 | :Sun Nov 5 07:31:12 GMT 2006 | 219 | :Sun Nov 5 07:31:12 GMT 2006 |
|---|
| 220 | | 220 | |
|---|
| 221 | Right now i want to get to demo fastest, so I hooked this up to wirelesslondon's old database and copied the created dates to dated dates. There are 4.5K things in it, which GetFeature by default won't deal with very well. | 221 | Right now i want to get to demo fastest, so I hooked this up to wirelesslondon's old database and copied the created dates to dated dates. There are 4.5K things in it, which GetFeature by default won't deal with very well. |
|---|