AtomPub: Not Just For Blogs Anymore, Now Also Good For Blog Powered OPACs!

As I mentioned previously, the primary use case for Jangle at this stage is discovery systems. Not that it wouldn't be useful for other kinds of applications, but the lack of specification around Create, Update or Delete limits its immediate usefulness in certain situations. Also, the OPAC replacement is not only an immediate need for many libraries, it's also a pretty good demonstration of much of Jangle's functionality.

In my quest for projects that would provide simple targets for Jangle integration, I decided to take a stab at Scriblio, Casey Bisson's WordPress Plugin-cum-OPAC. It meets my basic proof-of-concept criteria: easy to install (WordPress has got to be the easiest web application to install ever), easy to maintain, easy to hack on. I had actually downloaded Scriblio sometime over the summer to play around with. More on that later.

Scriblio is a snap to install. After you install WordPress, just unzip the bSuite and Scriblio plugins in the wp-content/plugins directory and activate them in the Dashboard (wp-admin). That's it.

Of course, a blog powered OPAC without any catalog records in it is, well, just a blog. Scriblio includes a MARC record importer, utilizing the PHP-MARC parser from the Emilda project. The version I had also had a III importer and a SirsiDynix Horizon importer. Again, more on that later. The importers use the same sort of API hooks as importing from another blogging platform. I used the III importer template for my Jangle importer, although, in retrospect, it would probably have made more sense to use one of the standard WordPress importers that could already deal with Atom. Of course, "dealing with Atom" is drop-dead simple, so it wasn't a big deal to add that functionality myself.
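To give a sense of how little "dealing with Atom" involves, here is a minimal sketch in Python (not the PHP the importer is actually written in). It reads one page of an Atom feed with the standard library and returns the entries plus the URL of the next page. The function name, the sample feed, and the use of a rel="next" link for paging are assumptions for illustration, not a description of the Jangle Importer's code.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
NS = {"atom": ATOM}


def parse_atom_page(xml_text):
    """Return (entries, next_url) for one page of an Atom feed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # An HTML page or an empty body fails here, with a clear message,
        # rather than later on a missing element.
        raise ValueError("response is not well-formed XML") from exc
    if root.tag != "{%s}feed" % ATOM:
        raise ValueError("response is XML but not an Atom feed")

    entries = []
    for entry in root.findall("atom:entry", NS):
        entries.append({
            "id": entry.findtext("atom:id", default="", namespaces=NS),
            "title": entry.findtext("atom:title", default="", namespaces=NS),
            "updated": entry.findtext("atom:updated", default="", namespaces=NS),
        })

    # Assumption: the feed advertises further pages with a rel="next" link.
    next_url = None
    for link in root.findall("atom:link", NS):
        if link.get("rel") == "next":
            next_url = link.get("href")
            break
    return entries, next_url


# Hypothetical feed, made up for this example.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Resources</title>
  <link rel="next" href="http://example.org/resources/?page=2"/>
  <entry>
    <id>http://example.org/resources/1</id>
    <title>First record</title>
    <updated>2008-11-01T00:00:00Z</updated>
  </entry>
</feed>"""

entries, next_url = parse_atom_page(SAMPLE)
```

A harvester would call this in a loop, fetching next_url until it comes back as None.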

The import uses the MARC21 binary feed, since I wasn't sure how well the PHP-MARC library could deal with MARCXML. That way I was also able to just piggyback on Casey's code base, meaning less work for me. The Jangle Importer just harvests the records, deferring the actual importing to the regular Scriblio Catalog Importer. The import process is sllllllooooooooooooowwwwwwwwwww. Although most imports would only need to be incremental, sitting through the initial harvest for a collection of any size will take you the better part of a day. It seems like there's got to be a more efficient way to do this. The Jangle Importer requires a slight tweak to the {wp_prefix}_scrib_harvest table, changing the source field to a VARCHAR(255) to capture the entire Jangle URI. I realize theoretically a Jangle URI could be longer than 255 chars, but I'm going to ignore that for now.
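The table tweak could look something like the following sketch, which only builds the SQL statement for a MySQL-backed WordPress install. The helper name is hypothetical, and the statement assumes the source column carries no other attributes: MySQL's MODIFY restates the whole column definition, so any NOT NULL or DEFAULT on the original column would need to be repeated. Check the existing definition before running it.

```python
import re


def widen_source_column_sql(table_prefix="wp_", length=255):
    """Build an ALTER TABLE statement that widens the harvest table's source column.

    table_prefix is the WordPress table prefix; the WordPress default "wp_"
    yields the table name wp_scrib_harvest.
    """
    # Identifiers cannot be bound as query parameters, so validate the prefix.
    if not re.fullmatch(r"[A-Za-z0-9_]*", table_prefix):
        raise ValueError("unexpected characters in table prefix")
    return "ALTER TABLE `{}scrib_harvest` MODIFY `source` VARCHAR({});".format(
        table_prefix, int(length)
    )


statement = widen_source_column_sql()
```

The resulting statement can be pasted into the mysql client or phpMyAdmin.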

While the Jangle Importer was written to be added as part of the Scriblio plugin, I decided to break out the item availability part as its own standalone plugin. It may be possible to move the importer into the Jangle availability plugin, but I wasn't sure if it was possible to use the existing importer scripts if I went that route.

The availability plugin loops through the posts displayed in the page and grabs the Item relation individually for each Resource. This is a seriously sub-optimal way to grab availability information, but I couldn't find any obvious way to iterate over the posts and grab the Jangle URIs beforehand and request the Items all at once. The plugin is another testament to how easy it is to add functionality like this, though, weighing in at less than 170 lines.
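The per-post lookup described above can be sketched roughly as follows, in Python rather than the plugin's PHP. The function names are hypothetical, and the idea that a Resource's Items relation lives at the Resource URI plus "items/" is an assumption for illustration. The point is only the shape of the loop: a page showing N posts makes N requests.

```python
def items_url(resource_uri):
    # Assumption: the Items relation is exposed at <resource URI>/items/.
    return resource_uri.rstrip("/") + "/items/"


def availability_for_page(resource_uris, fetch):
    """Fetch the Items feed for each Resource on the page, one request each.

    fetch is any callable that takes a URL and returns the response body,
    for example: lambda url: urllib.request.urlopen(url).read()
    """
    results = {}
    for uri in resource_uris:
        results[uri] = fetch(items_url(uri))
    return results


# Stand-in for a real HTTP call so the example runs without a network.
calls = []


def fake_fetch(url):
    calls.append(url)
    return "<feed/>"


page = ["http://example.org/resources/1", "http://example.org/resources/2/"]
availability = availability_for_page(page, fake_fetch)
```

Passing fetch in as a parameter keeps the loop testable and makes the request count easy to see: after the call, calls holds one URL per post on the page.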

Jangle uses the "space" that Scriblio reserves for availability information (Casey has a post about it here) and overrides the scrib_availability function. The behavior is a little different depending on whether you're looking at a record list page (like from a search) or a 'full record'.

When I got the plugin working, I installed WordPress and Scriblio on the demo site and successfully harvested the demo catalog. The availability plugin absolutely failed to work, however. It was then that I realized I had written everything against an old, outdated version of Scriblio (doh!). If I had spent more than a day and a half on it, I probably would have just deployed the old version of Scriblio on the demo site, but since it hadn't taken any time to speak of, I figured I might as well make the plugin work with the most recent revision of Scriblio.

So, let me take this opportunity to say that all of this works with Scriblio 2.7 b01. I cannot vouch for whether or not it will work with any other versions.

It took about another day to rewrite things to work with version 2.7. A lot of that time was spent trying to figure out why my changes weren't doing anything. Scriblio squirrels a lot of stuff away into the bSuite tables (which was the source of my problems), so if anything goes wrong and you want to start over again, be sure you empty those tables before you reload your records.

My main takeaway from this project is how much the ...

Follow the directions to install Scriblio. Do not use the Scriblio Catalog Importer yet.

Download the Jangle Importer and copy it into the Scriblio plugin directory.

In the WordPress admin Dashboard, go to "Utilities" -> "Import" -> "Scriblio Jangle Importer" and put your MARC21 binary Jangle feed in the form text box. Hit 'next' or whatever.

From this point on it's fine to use the regular Scriblio Catalog Importer. The Jangle Importer just needed to create the {wp_prefix}_scrib_harvest table.

Once you've harvested successfully, install the Jangle Availability Plugin and activate it.

If you're using the Scriblio theme, overwrite the style.css file with this one. Otherwise the 'availability' divs will be set to display:none (and they won't have any formatting even if you change that).

That's all it should take. If you want to "try before you buy", here is a demo (although note that dust jacket images aren't appearing and search just flat out doesn't work -- don't know why).

Comments

Jangle Importer needs Marc_URL ?

Ross,
I tested your plugin with Scriblio. One step prompts as follows:
"The Jangle feed URL:
[http:// ]
example: http://demo.jangle.org/openbiblio/resources/?format=marc"
When we enter a URL such as "http://sirsi.lib.pku.edu.cn", errors occur:
Reading records from http://sirsi.lib.pku.edu.cn/. Please be patient.
Fatal error: Call to a member function getElementsByTagName() on a non-object in C:\wamp\www\wordpress\wp-content\plugins\scriblio\importer_jangle.php on line 219

Obviously, it needs MARC records. But the problem is that we don't know the MARC URL at the moment (our ILS is Sirsi Unicorn); we have the retrieval URL only (http://sirsi.lib.pku.edu.cn). We asked customer service for Unicorn's general MARC URL rules, but have received no answer in the past 2 months.

Do you have any general rules about Sirsi Unicorn MARC URLs? Or do you have any suggestion on the format we should enter (such as http://sirsi.lib.pku.edu.cn/????) to import our Unicorn records with the Jangle Importer plugin?

Looking forward to your reply, Rossy.

************************
Benjun Zhu
Peking University Library,P.R.China
email:pkuzhuzhu(a)gmail.com