There can’t be many people who have not heard of Twitter, but most people don’t realise that Twitter is a protocol and a technology, rather than just a single website. There is a wealth of tools that hook into Twitter feeds to provide better usability than the pretty basic Twitter site.
We’ve been beavering away and have created a system that will allow us to use Twitter feeds to create tools and sites targeting specific types of users, from consumer gossip through to corporate monitoring or online reputation management.
We basically split everything into three distinct chunks of functionality:
- Tweet Gatherer
This desktop application maintains the Since_ID for each separate query that we carry out, so that we are only ever looking from where we were last time we checked. It executes the query using the Search API, and manages a schedule so that queries that have generated results more recently get precedence over queries that haven’t. There is a descending scale, so that queries that have no results for more than 10 days only get checked occasionally, whereas queries with results in the last hour get checked more frequently. We can also insert new queries into the queue.
- Tweet Publish
The desktop client is executing the queries, and maintaining the current tweet ID position, but when a result is found, it sends it to a web service (either hosted locally for testing, or remote for pushing to the live servers). This makes it relatively easy to have a bank of PCs running different versions of the desktop client for different applications.
- Front end website
The previous two steps ensure that the main website (or websites…) are relatively simple affairs. They simply query the database and process the content to do what we want, and fit into the layout.
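To make the gatherer’s scheduling a bit more concrete, here is a minimal Python sketch of the idea: each query remembers its highest tweet ID and when it last produced a result, and the polling interval stretches as results get older. This is not our actual code. The `TrackedQuery` class, the function names, the middle tier and all of the interval values are assumptions made for illustration; only the "last hour" and "more than 10 days" boundaries come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical tiers: (maximum age of the newest result, polling interval).
# The interval values are illustrative guesses, not the real settings.
TIERS = [
    (timedelta(hours=1), timedelta(minutes=5)),   # results in the last hour
    (timedelta(days=10), timedelta(hours=1)),     # results in the last 10 days
]
SLOWEST_INTERVAL = timedelta(hours=12)            # nothing for more than 10 days


@dataclass
class TrackedQuery:
    text: str
    since_id: int = 0                             # highest tweet ID seen so far
    last_result_at: Optional[datetime] = None
    last_checked_at: Optional[datetime] = None


def poll_interval(query: TrackedQuery, now: datetime) -> timedelta:
    """Pick a polling interval based on how recently the query had results."""
    if query.last_result_at is None:
        return SLOWEST_INTERVAL
    age = now - query.last_result_at
    for max_age, interval in TIERS:
        if age <= max_age:
            return interval
    return SLOWEST_INTERVAL


def is_due(query: TrackedQuery, now: datetime) -> bool:
    """Newly inserted queries run straight away; others wait their interval."""
    if query.last_checked_at is None:
        return True
    return now - query.last_checked_at >= poll_interval(query, now)


def record_results(query: TrackedQuery, tweet_ids: list, now: datetime) -> None:
    """Update the bookkeeping after a search has been run."""
    query.last_checked_at = now
    if tweet_ids:
        query.since_id = max(query.since_id, max(tweet_ids))
        query.last_result_at = now
```

The gatherer loop would then call `is_due` for each query, run the due ones with their stored `since_id`, and pass the returned tweet IDs to `record_results`.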
Our first application to use this is Twit Parade (a play on the phrase Hit Parade, NOT a slur on the people listed!), which monitors the tweets of a growing number of UK Celebrities.
Things to note: the Twitter search servers can be a bit of a pain at times. It can take over 20 minutes for a tweet that you can see on a Twitter feed to appear via the Search API. I assume there is some kind of caching in place, because tweaking the query in a way that shouldn’t affect the results, such as appending the same query again with an +OR+, will often return the missing tweet. Not guaranteed, however… Hopefully these problems will ease as Twitter scales. Maybe.
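As a rough illustration of the +OR+ tweak, the Python sketch below repeats a query as a redundant OR clause and URL-encodes it. The function names are made up for this example, and the behaviour of the search servers is only what we observed, not anything documented.

```python
from urllib.parse import quote_plus


def with_redundant_or(query: str) -> str:
    """Repeat the query as an OR clause; the matching tweets should not change."""
    return f"{query} OR {query}"


def encode_query(query: str) -> str:
    """URL-encode a query so it can be used as a search parameter value."""
    return quote_plus(query)


# Spaces become "+", which is where the "+OR+" in the request comes from.
print(encode_query(with_redundant_or("from:example_user")))
```

Because the doubled query is logically identical to the original, any difference in what comes back points at the servers rather than at the query itself.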
Despite attempting different methods of querying the search API, there are certain users that simply can’t be found by Twitter search. In our initial list of celebs, we had about 15 who simply didn’t exist when you tried a “from:xxx” search. There seem to be issues every now and then whereby a user can’t be found, but the problems we have seen might be due to people falling foul of the filters that Twitter applies - from which there is no recovery…
Which was a shame, as we wanted to track as many celebs as possible, and in particular wanted to follow Eddie Izzard and his absolutely incredible marathon journey across the UK.
So a little bit of lateral thinking, and a new database field, and we’re up and running. The solution was to simply consume the public facing RSS feed available on a Twitter user’s page.
This is not as elegant as using the API calls, but it works, and allows us to track people who we otherwise couldn’t. Some manual setup needs to be done first: find the avatar image they have set up (the RSS feed doesn’t return their image), and then find the actual filename for the RSS file.
Everything else just slots into place. We still maintain the highest Tweet ID on a per user basis, and everything after the gatherer application stays the same. The result is pretty much the same.
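For anyone wanting to try the same trick, here is a minimal Python sketch of reading a user’s RSS feed and keeping only the tweets newer than the highest ID already stored. The sample feed, the `new_tweets` function and the assumption that each item’s link ends in the numeric tweet ID are ours for illustration; check them against the real feed before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of what a user's RSS feed might look like.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Twitter / example_user</title>
    <item>
      <title>example_user: second tweet</title>
      <link>http://twitter.com/example_user/statuses/1002</link>
    </item>
    <item>
      <title>example_user: first tweet</title>
      <link>http://twitter.com/example_user/statuses/1001</link>
    </item>
  </channel>
</rss>"""


def new_tweets(feed_xml: str, highest_seen_id: int):
    """Return (tweets newer than highest_seen_id, updated highest ID)."""
    root = ET.fromstring(feed_xml)
    found = []
    for item in root.iter("item"):
        link = item.findtext("link", default="")
        # Assumption: the last path segment of the link is the tweet ID.
        tail = link.rstrip("/").rsplit("/", 1)[-1]
        if not tail.isdigit():
            continue
        tweet_id = int(tail)
        if tweet_id > highest_seen_id:
            found.append((tweet_id, item.findtext("title", default="")))
    found.sort()  # oldest first, so they are published in order
    new_highest = found[-1][0] if found else highest_seen_id
    return found, new_highest
```

The returned highest ID plays the same role as the Since_ID does for the Search API queries, so the rest of the pipeline does not need to know where a tweet came from.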
So our gathering application has another feature, and we can track more people.
Eddie Izzard on Twit Parade