BBox, a feed collector with optional spatial index
--------------------------------------------------

BBox is an RSS / Atom / RDF feed aggregator.

Please see the file INSTALL for instructions on setting up a BBox.
It details the software dependencies: mainly the Redland RDF toolkit
(http://librdf.org/) and, optionally, PostGIS for a spatial index
(which will be replaced in future with a standalone one that has no
external dependency).

Setting PYTHONPATH
------------------

These modules must be locatable by Python.
Set PYTHONPATH to wherever your main 'bbox' directory lives.

    #> export PYTHONPATH=~/consumotronic/bbox

From within a script, add the same directory to sys.path:

    import sys
    sys.path.append('/home/jo/consumotronic/bbox')

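A script that should work for more than one user can expand '~' itself
rather than hard-coding a home directory. The following is a minimal
sketch using only the standard library; the helper name add_to_sys_path
is made up for this example and is not part of BBox.

```python
import os
import sys


def add_to_sys_path(directory):
    """Append a directory to sys.path once, expanding '~' first.

    Hypothetical helper for illustration; BBox does not ship it.
    """
    path = os.path.abspath(os.path.expanduser(directory))
    if path not in sys.path:
        sys.path.append(path)
    return path


# Same location as in the export line above.
bbox_dir = add_to_sys_path('~/consumotronic/bbox')
```

Calling the helper a second time with the same directory leaves
sys.path unchanged, so it is safe to use from several scripts.
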
Running BBox
------------

A new BBox must be *bootstrapped* with an RDF model.
The file 'store/boot.rdf' contains a map of RDF namespaces
to aliases for them.

Either edit bbox/config.py (instructions are in the INSTALL file)
or pass in the name of BBox's data store via the command line.

Without config options set:

    #> python store/boot.py /path/to/directory/and/filename
OR
    #> python store/boot.py filename

With config options set:

    #> python store/boot.py


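The choice between a command-line argument and a configured value can
be pictured as below. This is only a sketch of the precedence described
above, not the code in store/boot.py; the function name resolve_store
and the 'configured' parameter are assumptions made for the example.

```python
def resolve_store(argv, configured=None):
    """Pick the data store name: command-line argument first, then config.

    Hypothetical illustration; store/boot.py may behave differently.
    """
    if len(argv) > 1:
        return argv[1]
    if configured:
        return configured
    raise ValueError('no data store given on the command line or in config')


# Name passed on the command line:
print(resolve_store(['boot.py', '/path/to/directory/and/filename']))
# No argument, so the configured name is used:
print(resolve_store(['boot.py'], configured='box2'))
```
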
Using a BBox in application code
--------------------------------

    import rdfobj
    from bbox import BBox

    bbox = BBox()                  # loads options from bbox/config.py
    bbox = BBox(db='box2')         # uses this filename as storage instead
    bbox = BBox(spatial='bboxdb')  # adds the optional spatial index

    m = rdfobj.Model('box2', db=1)
    bbox = BBox(model=m)           # uses this model instead of making its own

Please see 'pydoc bbox' for interface documentation, or read
the section 'The BBox Console', below, and ask it 'help(bbox)'.

Please see 'pydoc rdfobj' or ask the console for 'help(rdfobj)'
for details on querying and manipulating BBox's data store.


Feeds that BBox will read
-------------------------

BBox uses Mark Pilgrim's feedparser - http://feedparser.org/
This will eat all flavours of RSS, Atom, etc. To put a feed through
feedparser:

    bbox.subscribe('http://example.org/index.rss', format='rss')

BBox also uses a generic RDF parser, raptor from Redland - http://librdf.org/
To put a file through raptor:

    bbox.subscribe('http://example.org/index.rdf', format='rdf')


The BBox Console
----------------

The console is the quickest way to get started with BBox.

    #> python bbox/console.py filename

This is a custom Python console with BBox, rdfobj and the set of
RDF namespace aliases loaded into it.

Sample session:

    [jo@vishnu bbox]$ python bbox/console.py testing
    BBox console (the BBox 'testing' is loaded as 'b')
    b.subscriptions()
    b.subscribe('http://example.org/test.rss',format='rss')
    b.read_subscriptions

    For more help, try help(bbox)

    >>> b.subscribe('http://frot.org/devlog/index.rss',format='rss')
    >>> b.subscribe('http://www.evnt.org/zool/changes.rss',format='rss')
    >>> b.read_subscriptions()
    >>> subs = b.subscriptions()
    >>> for s in subs: print s.fbox_channel
    ...
    http://frot.org/devlog/index.rss
    http://frot.org/bin/ghug/tag/nodel

    >>> for s in subs: print s.fbox_last_etag
    ...
    "1070424-e066-8a7dfd80"
    None
    >>> for s in subs: print s.fbox_last_modified
    ...
    Tue, 21 Jun 2005 19:34:30 GMT
    None


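The fbox_last_etag and fbox_last_modified values in the session look
like the validators used for HTTP conditional requests. Assuming that
is what BBox keeps them for (this README does not say so explicitly),
the sketch below shows how such values turn into request headers. The
function name conditional_headers is made up for the example.

```python
def conditional_headers(etag=None, last_modified=None):
    """Build HTTP conditional-request headers from stored validators.

    Hypothetical helper; it only shows the standard header names and
    is not taken from BBox's source.
    """
    headers = {}
    if etag is not None:
        headers['If-None-Match'] = etag
    if last_modified is not None:
        headers['If-Modified-Since'] = last_modified
    return headers


# Values copied from the sample session above.
headers = conditional_headers('"1070424-e066-8a7dfd80"',
                              'Tue, 21 Jun 2005 19:34:30 GMT')
```

A server that honours these headers answers '304 Not Modified' when the
feed has not changed, so nothing needs to be parsed again. A
subscription whose stored values are both None, like the second one
above, simply gets an unconditional request.
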
BBox as simple RDF/RSS parser
-----------------------------

You can use bbox in applications without having to use its subscription
management / HTTP header support, just as a simple frontend to an RDF
data store.

    objects = bbox.read_rss('http://example.org/index.rss')

    objects = bbox.read_rdf('http://example.org/index.rdf')

    for o in objects:
        print o.dc_title
        # or whatever

Both the read_rdf and read_rss methods return a list of rdfobj.Objects.
These are Python objects that keep their properties in the Redland
store, and can be addressed with a pythonic syntax. Short example:

    >>> person = bbox.model.create(foaf.Person)
    >>> print foaf.name
    http://xmlns.com/foaf/0.1/name
    >>> person[foaf.name] = 'Jo Walsh'

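To give a feel for how a namespace alias such as 'foaf' can yield full
property URIs, here is a small standalone sketch. It is not rdfobj's
implementation: the Namespace class, the ALIASES table and the expand
function are invented for this example, and reading 'dc_title' as
alias 'dc' plus local name 'title' is an assumption.

```python
class Namespace(object):
    """Map attribute access onto URIs under a common base.

    Illustrative stand-in only, not the class rdfobj uses.
    """

    def __init__(self, base):
        self._base = base

    def __getattr__(self, name):
        # Only reached when normal lookup fails; keep private names
        # behaving like ordinary missing attributes.
        if name.startswith('_'):
            raise AttributeError(name)
        return self._base + name


foaf = Namespace('http://xmlns.com/foaf/0.1/')
dc = Namespace('http://purl.org/dc/elements/1.1/')

# Hypothetical alias table, in the spirit of the map in store/boot.rdf.
ALIASES = {'foaf': foaf, 'dc': dc}


def expand(attribute):
    """Turn an 'alias_localname' attribute into a full property URI."""
    prefix, _, local = attribute.partition('_')
    return getattr(ALIASES[prefix], local)


print(foaf.name)
print(expand('dc_title'))
```
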