While catching up on my reading of Dr. Dobb's Journal, I came across an interesting article by Michael Owens about writing virtual tables for SQLite. It got me thinking about a small hack I've wanted to do for a while: a table that reads an RSS/Atom feed and presents the data to the query engine. Originally, I was planning to implement this as a MySQL Storage Engine, but since I was reading this article and the interface seemed easy enough to work with, I decided to just whip together a simple prototype for SQLite instead. Since I currently don't have a good place to publish the repository, I have put a tarball at http://www.kindahl.net/pub/sqlite-feedme-0.01.tar.gz.
After building and installing, creating the table is as simple as this:
    mats@romeo:~/proj/feedme$ sqlite3
    SQLite version 3.4.2
    Enter ".help" for instructions
    sqlite> .load libfeedme.so
    sqlite> create virtual table onlamp
       ...> using feedme('http://www.oreillynet.com/pub/feed/8');
    sqlite> select title from onlamp;
    PyMOTW: weakref
    What the Perl 6 and Parrot Hackers Did on their Christmas Vacation
    Least Appropriate Uses of Perl You've Seen
    YAP6 Operator: Filetests?
    WILFZ (What I Learned From Zope): Buildout
    TPT(Tiny Python Tip): Watch Jeff Rush's Videos
    PyCon 2008 Talks and Tutorials Finalized
    TPT(Tiny Python Tip): Python for Bash Scripters
    What the X-Files Taught Us about Real Aliens
    Python Web Framework Comparison: Documentation and Marketing
    Python Web Framework Comparison: Documentation and Marketing
    PyMOTW: mmap
    Improving Test Performance
    YAP6 Operator: Reduce Operators - Part II
    WSGI: Python Web Development's Howard Roark

Note that it is still a prototype. My plans are to at least:
- Read the entire feed into memory and parse it from there instead of writing the feed to disk before parsing it. Writing it to disk was the default for cURL, so I just stuck with that for the prototype (yeah, yeah, I know I'm lazy).
- Detect the feed format automatically and select the parser accordingly. Right now, it can only handle Atom feeds, and it does not do a great job of that either.
- Figure out a way to present multi-valued entry data in a useful way. For example, an entry can hold several links, but which one is really the interesting one?
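To make these three plans a bit more concrete, here is a minimal sketch of how they might fit together. It is written in Python with the standard library's ElementTree for brevity, not in the C of the prototype, and it is not code from the feedme distribution. The function names (`detect_feed_format`, `pick_link`, `parse_feed`) and the sample feed are made up for illustration. The link choice relies on one assumption about what "interesting" means: prefer the link with `rel="alternate"`, which is also what Atom assumes when `rel` is missing, and fall back to the first link otherwise.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def detect_feed_format(root):
    """Guess the feed format from the root element of the parsed document."""
    if root.tag == f"{{{ATOM_NS}}}feed":
        return "atom"
    if root.tag == "rss":
        return "rss"
    return "unknown"


def pick_link(entry):
    """Pick one link from an Atom entry (assumption: 'alternate' is the interesting one)."""
    links = entry.findall(f"{{{ATOM_NS}}}link")
    for link in links:
        # A missing rel attribute is treated as "alternate".
        if link.get("rel", "alternate") == "alternate":
            return link.get("href")
    # No alternate link: fall back to the first one, if there is any.
    return links[0].get("href") if links else None


def parse_feed(data):
    """Parse a feed held in memory (bytes) and return (format, [(title, link), ...])."""
    root = ET.fromstring(data)  # parses directly from memory, no temporary file
    fmt = detect_feed_format(root)
    rows = []
    if fmt == "atom":
        for entry in root.findall(f"{{{ATOM_NS}}}entry"):
            title = entry.findtext(f"{{{ATOM_NS}}}title", default="")
            rows.append((title, pick_link(entry)))
    elif fmt == "rss":
        for item in root.iter("item"):
            rows.append((item.findtext("title", default=""), item.findtext("link")))
    return fmt, rows


# Hypothetical sample feed, used only to exercise the sketch.
SAMPLE_ATOM = b"""<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry>
    <title>First post</title>
    <link rel="replies" href="http://example.com/1/comments"/>
    <link rel="alternate" href="http://example.com/1"/>
  </entry>
</feed>"""

if __name__ == "__main__":
    fmt, rows = parse_feed(SAMPLE_ATOM)
    print(fmt, rows)
```

In the virtual table, each `(title, link)` pair would correspond to one row. Whether a single "best" link is the right answer, or whether the extra links deserve a table of their own, is still an open question.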