Tag Archives: museums

Tweets, geospatial analysis, giraffes and a little bit of museums for good measure: What I got from the CASA seminar: Harvesting the Crowd

3 Mar

Yesterday I went to my first CASA seminar, and it was great! Well, the second half was. The first half involved a lot of equations about thermodynamics and I didn’t have a clue what was going on. I had also chosen an unfortunate seat near the front, so I was slightly terrified that he might ask the group to solve said equations. I have trouble adding up numbers, let alone Greek letters. I’m sure it was brilliant, but unless you have a firm basis in maths or thermodynamics I don’t think you would have stood a chance. Then came the good stuff, the stuff I understand, and the stuff that makes me happy: social media, visualisations, maps (I am a closet geospatial nerd who has no geospatial abilities; I’m like Superman on kryptonite), animal logos and, to my happy surprise, a bit about museums.

Steve Gray and Fabian Neuhaus provided an overview of the tools in CASA’s crowdsourcing toolkit: SurveyMapper, Tweet-o-Meter and the Twitter Collection tools. There has been a massive explosion of handheld mobile devices with GPS, as well as a move to crowdsourcing info, and this has produced a heck of a lot of online geospatial data. Add newly released public sector data and you get yourself an exciting situation where people can take that data and turn it into something more interesting. CASA work on integrating tools for unlocking, exploiting, understanding and sharing new data sets, and also on enabling users to have a go at mapping and spatial analysis.

Firstly Steve talked about SurveyMapper, a real-time geographic survey tool. What I like about SurveyMapper is that it doesn’t take itself too seriously. It knows it’s doing clever stuff behind the scenes, but presents a friendly giraffe (you can’t not love a giraffe) with an easy-to-use interface. One of the surveys Steve discussed was the BBC’s Look East broadband speed survey, which produced a lot of responses: over 6500 in a day, I think (I might be wrong on that one).

Tweet-o-meter – This is genius and really beautiful too.

Tweet-o-Meter is powered by CASA, as part of the NeISS project. Created by Steven Gray

Tweet-o-meter harvests geospatial data from Twitter with the aim of creating a series of new city maps based on Twitter data. Data is collected from tweets sent via a mobile device that includes the location at the time of sending the tweet. Tweets are gathered within a radius of 30km around each city, and the number of tweets has been collated to create New City Landscape maps of, for example, London, New York and Paris.
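For the curious, the 30km radius idea can be sketched in a few lines of Python. This is only an illustration and not CASA's actual code: the tweet dictionaries, the `tweets_within_radius` helper and the sample coordinates are all made up for the example, and the distance uses the standard haversine formula.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def tweets_within_radius(tweets, centre, radius_km=30.0):
    """Keep only the tweets whose coordinates fall inside the radius (hypothetical helper)."""
    lat0, lon0 = centre
    return [t for t in tweets if haversine_km(lat0, lon0, t["lat"], t["lon"]) <= radius_km]


# Toy data: the coordinates are approximate and the tweets are invented.
LONDON = (51.5074, -0.1278)
sample = [
    {"id": 1, "lat": 51.5194, "lon": -0.1270},  # central London
    {"id": 2, "lat": 51.7520, "lon": -1.2577},  # Oxford, well outside 30km
]
nearby = tweets_within_radius(sample, LONDON)
print([t["id"] for t in nearby])
```

Counting how many tweets survive the filter for each city would give the raw numbers that a map like this is built from.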

Created by Urban Tick

I think this is a beautiful analogy for Twitter activity, where contours correspond to the density of tweets, mountains rise in active Twitter locations and cliffs drop down into valleys of tweet deserts.

UrbanTick has the full set of the different New City Landscapes, all available in a Google Maps viewer (I think). Head over to take a look at the gorgeousness.

Steve and Fabian explained that there are now 60 cities around the world that have their tweets monitored over the period of one week. Amsterdam is a top tweeter, with over 50% of its tweets geolocated, whereas London, which is still a really active city, sends on average only about 10% geolocated tweets. Visualisations clearly showed that some cities seem to be more active in the morning and others in the evening, producing some lovely looking kidney shapes. The data also shows that some days of the week are more conducive to tweeting than others; for example, Monday and Tuesday are generally less active than the rest of the week.
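A percentage like "over 50% geolocated" is just a per-city ratio. Here is a minimal sketch of that calculation, assuming each tweet is a dictionary with a `city` and an optional `coords` field. Those field names and the toy data are my own invention, not the format CASA uses.

```python
from collections import Counter


def geolocated_share(tweets):
    """Return {city: fraction of that city's tweets that carry coordinates}."""
    totals = Counter()
    located = Counter()
    for tweet in tweets:
        totals[tweet["city"]] += 1
        if tweet.get("coords") is not None:
            located[tweet["city"]] += 1
    return {city: located[city] / totals[city] for city in totals}


# Toy data only: four tweets per city, not real measurements.
toy = [
    {"city": "A", "coords": (52.37, 4.90)},
    {"city": "A", "coords": (52.36, 4.88)},
    {"city": "A", "coords": None},
    {"city": "A", "coords": None},
    {"city": "B", "coords": (51.51, -0.13)},
    {"city": "B", "coords": None},
    {"city": "B", "coords": None},
    {"city": "B", "coords": None},
]
shares = geolocated_share(toy)
print(shares)
```

The same counting approach, keyed on weekday or hour instead of city, would give the activity patterns described above.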

Data was also collected during the early days of the Egyptian revolution in Cairo. It was really interesting to see how the protests and internet blackout affected Twitter activity. For example, when the big internet switch was flicked back on, the data shows an immediate rise in geolocated tweets.

And then came something that I got really excited about and something that I could really use in my PhD… Andy, Steve, Fabian … if you’re reading this, can you show me how to do it? Pretty please!? I will buy you cake. Lots of it.

Tweeting art: most museums are now using Twitter, and CASA have taken that information and turned it into really awesome spider-like explosions of communication network visualisations. These show how different museums (the examples given included Tate and MoMA) link in to the wider Twitter network and also how they link to each other; in essence, how the institutions interact with other users and how this connects them into an entangled social network. For example, Tate and MoMA tweet to roughly the same followers but don’t really tweet to each other. I think this is fascinating, particularly if it can show whether museums are only using Twitter as a broadcast medium, pushing marketing out, or whether they are creating engaging discussions and digital experiences with their followers.
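Two of the ideas here are easy to put rough numbers on: how much two accounts' followers overlap, and how much of an account's output is conversation rather than broadcast. The sketch below is my own guess at a starting point, not the method CASA used. It assumes follower lists are plain sets of usernames and treats any tweet starting with "@" as a reply, which is a crude rule of thumb.

```python
def follower_overlap(followers_a, followers_b):
    """Jaccard overlap: shared followers divided by all followers of either account."""
    a, b = set(followers_a), set(followers_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def conversational_share(tweet_texts):
    """Fraction of tweets that look like replies (text starts with '@')."""
    if not tweet_texts:
        return 0.0
    replies = sum(1 for text in tweet_texts if text.lstrip().startswith("@"))
    return replies / len(tweet_texts)


# Invented usernames and tweets, purely for illustration.
museum_one = {"anna", "ben", "cara"}
museum_two = {"ben", "cara", "dev"}
overlap = follower_overlap(museum_one, museum_two)

timeline = ["@anna thanks for visiting!", "New exhibition opens Friday",
            "Late opening tonight", "@dev yes, the cafe is open"]
share = conversational_share(timeline)
print(overlap, share)
```

A high overlap with a low conversational share would hint at two museums broadcasting to the same crowd without really talking to anyone.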

It was fascinating to be part of a seminar where people were talking about Twitter in an active, exciting research context and in a sensible manner, and where questions from the audience were serious, probing and engaged with the topic. Rather than ridiculous questions from anti social media people, for whom Twitter is a waste of time, full of pointless babble which makes yooths mediocre, there were brilliant questions: how do you model for uncertainty? What proportion of users geotweet? Does this skew the data? What about frequency and text mining to find out more about context? It was brilliant. I was engaged. And it reminded me that I am not out of my depth in this whole digital humanities thang. I do know what I am talking about, and this is a growing research field doing so much cool stuff. I am an alt academic and proud.

Notes from the Cultural Heritage and the Semantic Web day

25 Jan

I promised a few people that I would take copious notes at the British Museum Semantic Web event last week. To be honest, I don’t really understand the semantic web. Shocking for a museums web geek! I know that Linked Data is a good thing, but I couldn’t really tell you why, or how on earth you go about doing it. So I went along with the hope that I would become well informed and at least be able to do more than nod and smile when someone mentions semantic webness. The whole point of the event was to focus on projects that are already using semantic web technology. I’m quoting from the programme here: “By presenting a more practical insight into the use of the semantic web in the sector it is hoped that the current gap between the technologists and others who stand to benefit from the technology can be bridged.”

I don’t quite know if they managed that. It did seem like it was semantic web people talking to other semantic web people. But perhaps that is how it has to happen at first: people in the know discuss and ponder before it can filter down successfully to the rest of us.

First up was Wendy Hall.  Wendy showed us lots of pictures of conferences she attended in the 80s and 90s…

But then she said some interesting things like “Scruffy Works”, suggesting that you need to let links fail to make it scale. You shouldn’t aim for your linked data to be perfect first time round; there has to be room for experiment and exploration. Another of her key points was that the network is everything, and that open and free standards are hugely important. Wendy also talked about 5-star data; the W3C has a handy mug to explain the 5-star system of good linked data.

Two key questions came out of Wendy’s talk: how successful have Cultural Heritage practitioners been in working to develop and use semantic web tech? And where are the starting points?

Kenneth Hamma taught me the wonderful word bumbulum. He discussed zen and the art of the internet, and followed up by talking about ‘the wrong containers’: the notion of ‘my collection’, silos of information, gatekeepers and information containers. These don’t really work in the physical world, so why have we as museums extended this reasoning and practice to the Web? The semantic web breaks this tradition, but there has to be an act of ‘letting go’. He encouraged us to imagine what people will do with museum data if you let it go and allow all museum information to be joined up. This of course isn’t without its challenges, and starting to be open and free with data is difficult when you have to take the jump and ‘let go’.

John Sheridan from The National Archives spoke brilliantly about data.gov.uk and legislation.gov.uk. I like the fact that TNA is beginning to be classed as a sort of semantic knowledge base, which operates the UK government website archive, the 2nd most used web archive in the world. John spoke about developing standards for responsible publishing of key types of data and showing commitment to publishing in open standards. The National Archives have taken this opportunity to publish data in Linked Data form and make it available via the data.gov.uk website. This in turn makes it easy for people to consume data in a programmatic way, by developing Linked Data APIs with the facility to deliver data in multiple formats, as well as native linked data.

John also spoke about having appropriate standards for different levels. One thing I really liked was the concept of ‘re-use where we can, create where we must’. John also demonstrated data cleansing with Google Refine, which I liked particularly because non-coder types like myself can publish RDF data by clicking a few buttons without having to use any complicated </>’s.

Hugh Glaser from Seme4 started with the idea of time and location being very important. But what is more important is knitting everything together in order for it to make sense. He first mentioned the BBC’s dynamic semantic publishing of its World Cup coverage using RDF. Hugh then went on to talk about the classic data fusion problem, which exists at many museums and other organisations where many separate silos sit within the organisation. The British Museum Collection Online (COL) is a prime example. The cataloguing data is in one database, the conservation data will be in another, the acquisition data somewhere else, and the science data in yet another. Using some very clever ontology, all of that data is now tabbed at the bottom of catalogue entries. Now I found this fascinating. Why? Well, I’ve done some work on the info seeking behaviour of users of the BM’s COL and not one of them mentioned the linked data. Nor did I notice any mention of this on the COL itself. My worry is that not enough people understand how linked data works, myself included, and that nifty things like all the data from lots of different databases about the Rosetta Stone being linked together in one place are being overlooked and, possibly more importantly, not being shouted about.

Hugh went on to demonstrate the Resist Knowledge Base (RKB) and RKB explorer, which is a knowledge-enabled infrastructure that displays the semantic relationships of individuals.

Hugh then stated that linked data was bringing ‘added value’ through more sophisticated services, and because an open system means you don’t have to do everything yourself. Added value to whom? How, for example, do semantic relationships deal with provenance data, object biographies, mapping of historical data?

Atanas Kiryakov had a brilliant analogy for the Semantic web… it is like teenage sex. Lots of people talk about it, not that many actually do it, and for those that do it is a less than satisfactory experience.

Linked Data is hard for people to comprehend, and its sheer diversity is problematic. The Linked Data Web is unreliable: most of the servers are slow, because dealing with distributed data on the web is slow, and there is a lot of downtime.

I liked Atanas’ talk; it was straight to the point. Linked data is a good idea. He does believe that linked data adds value to proprietary data through better description, whilst being able to make data more open. But in practice it isn’t well used because there are no well established opinions about what exactly linked data can ‘buy’ businesses. There is a need to facilitate better data integration, and provide additional public information which can help with alignment and linking info up.

Jonathan Whitson Cloud started his talk with ‘it would be a shame to come to the BM and not talk about objects’ and then used a couple of lovely looking cuneiform tablets as examples, explaining that it all started with structured data…

Jonathan went on to discuss the conservation and scientific research documentation project, stating that adding taxonomy afterwards is quite tricky. He showed that there are lots of concerns around sharing data, but at least there has been a lot of talking about it. Different types of people have different types of issues about sharing data and then linking it up:

The British Museum has its reputation to uphold, and likes being the first to do things. The conservators are concerned about data quality, data protection, academic process and ownership, and personal as well as institutional reputation. The scientists are concerned about academic process and ownership, data content, previous failed systems, effort vs reward, data quality, and personal and institutional reputation. Documentation and IS are concerned about data quality and the hierarchies and thesauri used. The list could go on. A lot to think about when producing a business case for sharing data. However, the process of moving towards sharing data acts as a catalyst for data cleaning and structuring. BM collections data is due to be structured and stored semantically by the end of Feb 2011. I did notice that with all this talk of people, not once was the ‘end user’ mentioned. It’s all well and good restructuring data for internal sharing and a more cohesive organisation, with nice linked data. But what does that mean for the web user who wants to find out about a specific object and its location in the museum? I have a big scribble in my note pad (laptop battery fail) simply saying WHAT ABOUT THE USERS?

Leif Isaksen gave an interesting talk about the past, present and future of the semantic web in cultural heritage. Technology for data integration is not the tech which is changing society, and nor is the volume of data; what is important is a positive feedback loop of information exchange, where culture is the textual and material artefacts we choose to exchange information about. However, society has an overwhelming interest in popular culture, which is sidelining cultural heritage. Leif then went on to describe a semantic ecosystem:

  • Entity services (British Museum, London)
  • Ontology services (building, city, is located in)
  • Data services (British Museum, is located in, London)
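The three layers in that list can be mocked up with nothing more than Python dictionaries and tuples. This is a toy sketch of the idea rather than real RDF: the `ex:` identifiers, the type table and the helper functions are all hypothetical names I have made up for illustration.

```python
# Entity service: maps human-readable names to identifiers (hypothetical "ex:" prefix).
ENTITIES = {"British Museum": "ex:BritishMuseum", "London": "ex:London"}

# What kind of thing each entity is, using the ontology's classes.
ENTITY_TYPES = {"ex:BritishMuseum": "ex:Building", "ex:London": "ex:City"}

# Ontology service: each relationship with the types it is allowed to connect.
ONTOLOGY = {"ex:isLocatedIn": ("ex:Building", "ex:City")}

# Data service: actual statements as (subject, predicate, object) triples.
DATA = [("ex:BritishMuseum", "ex:isLocatedIn", "ex:London")]


def conforms(triple):
    """Check that a triple uses a known relationship between the right kinds of thing."""
    subject, predicate, obj = triple
    if predicate not in ONTOLOGY:
        return False
    domain, range_ = ONTOLOGY[predicate]
    return ENTITY_TYPES.get(subject) == domain and ENTITY_TYPES.get(obj) == range_


def located_in(place_name):
    """List every subject stated to be located in the named place."""
    target = ENTITIES[place_name]
    return [s for (s, p, o) in DATA if p == "ex:isLocatedIn" and o == target]


print(conforms(DATA[0]), located_in("London"))
```

In a real system each layer would live on a different server and be addressed by URIs, which is where the joined-up (and sometimes slow) nature of linked data comes from.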

Dominic Oldman spoke about The Research Space, which aims to support scholarly research online, VRE anyone?

Research Space is an environment which aims to generate new knowledge through collaboration. It creates a research collaboration and digital publication environment, bringing together data, collaboration and research tools into one space: blogging tools, forums and wikis alongside the RDF imported data.

So by the end of it all, I was quite confused and I still don’t know my SPARQL from my CIDOC; it sounds like a Saturday night involving sequins and hiccups to me. But I am glad I went, particularly as I am not alone in my lack of understanding of this whole Semantic web thing. If more people like John Sheridan and Atanas Kiryakov can hold more sessions explaining this pesky, much talked about but not much done linked dataitus, it would go a long way to solving some of the ‘Buy In’ issues and make people feel less dumbfounded by it all.
