Tuesday, 29 April 2008

Talking to IT students about the cultural heritage sector, and a small 'woot'

I've just written a report of a visit I made with June (our diversity manager) and Bilkis (our web content manager) to Kingston University to talk to students from the Faculty of Computing, Information Systems and Mathematics about the role of IT professionals in museums.

The full post is on the Museum of London blog ('Why should IT students consider working in cultural heritage?') but I thought it was worth linking to here because the discussion raised lots of interesting questions that might benefit from a wider audience:
How can we engage with our audiences? How would you challenge us, as a museum, to do a better job? Is there obvious stuff we’re missing? Do you have an idea for a project a museum could work with you on? Do you want to contribute to our work? Do you have any more questions about museum jobs?

On a more theoretical level, what effect might new methods of collecting objects or stories have - does it create a new kind of visibility for content from IT literate people with reliable access to the internet? How can we engage with people who aren’t comfortable online?
I think I got more out of the session than the students did, and it's nice to think that one or two of them might consider working in a museum when they graduate.

And the small 'woot'? This blog has been listed as an example of a 'programming and development blog' in the ComputerWeekly.com IT Blog Awards 08. I have no idea how that happened, but it's very flattering.

Saturday, 26 April 2008

Notes from 'Maritime Memorials, visualised' at MCG's Spring Conference

These are my notes from the data burst 'Maritime Memorials, visualised' by Fiona Romeo, at the MCG Spring meeting. There's some background to my notes about the conference in a previous post. Any of my comments are in [square brackets] below.

Fiona's slides for 'Maritime Memorials, visualised' are online.

This was a quick case study: could they use information visualisation to make more of collections datasets? [The site discussed isn't live yet, but should be soon]

A common visualisation method is maps. It's a more visual way for people to look at the data, it brings in new stories, and it helps people get a sense of the terrain in e.g. expeditions. They exported data directly from MultiMimsy XG and put it into KML templates.
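As a rough illustration of the 'export records into a KML template' idea (not the code used in the project), here is a minimal Python sketch. The record fields (`title`, `lat`, `lon`), the sample rows and the function name are assumptions made for the example; a real export would use whatever fields the collections system provides.

```python
from xml.sax.saxutils import escape

# Hypothetical records standing in for rows exported from a collections system.
records = [
    {"title": "Memorial to a ship's crew", "lat": 51.4826, "lon": -0.0077},
    {"title": "Memorial tablet & plaque", "lat": 51.6214, "lon": -3.9436},
]

# One Placemark per record; KML expects coordinates as longitude,latitude.
PLACEMARK = (
    "  <Placemark>\n"
    "    <name>{name}</name>\n"
    "    <Point><coordinates>{lon},{lat}</coordinates></Point>\n"
    "  </Placemark>\n"
)

def records_to_kml(rows):
    """Fill a simple KML template with one Placemark per record."""
    body = "".join(
        PLACEMARK.format(name=escape(r["title"]), lon=r["lon"], lat=r["lat"])
        for r in rows
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "<Document>\n" + body + "</Document>\n</kml>\n"
    )

kml = records_to_kml(records)
```

The resulting string can be saved with a `.kml` extension and opened in a mapping tool that reads KML.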

Another common method is timelines. If you have well-structured data you could combine the approaches e.g. plotting stuff on a map and on a timeline.

Onto the case study: they had a set of data about memorials around the UK/world. It was quite rich content and they felt that a catalogue was probably not the best way to display it.

They commissioned Stamen Design. They sent CSV files for each table in the database, and no further documentation. [Though since it's MultiMimsy XG I assume they might have sent the views Willo provide rather than the underlying tables which are a little more opaque.]

Slide 4 lists some arguments for trying visualisations, including the ability to be beautiful and engaging, to be provocative rather than conclusive, to appeal to different learning styles and to be more user-centric (more relevant).

Some useful websites were listed, including the free batchgeocode.com, geonames and getlatlong.

'Mine the implicit data' to find meaningful patterns and representations - play with the transcripts of memorial texts to discover which words or phrases occur frequently.
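To make 'discover which words or phrases occur frequently' concrete, here is a small Python sketch that counts single words and three-word phrases across transcripts. The sample transcripts and function names are invented for the example, and this is only one simple way to do it, not necessarily the approach taken in the project.

```python
import re
from collections import Counter

# Hypothetical transcripts; real ones would come from the exported data.
transcripts = [
    "In memory of the crew who were lost at sea",
    "Sacred to the memory of a sailor lost at sea",
    "In loving memory of a master mariner",
]

def tokens(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def phrase_counts(texts, n):
    """Count every run of n consecutive words across all texts."""
    counts = Counter()
    for text in texts:
        words = tokens(text)
        counts.update(
            " ".join(words[i:i + n]) for i in range(len(words) - n + 1)
        )
    return counts

word_freq = phrase_counts(transcripts, 1)
phrase_freq = phrase_counts(transcripts, 3)
```

Calling `word_freq.most_common(10)` or `phrase_freq.most_common(10)` then lists the most frequent words or phrases.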

'Find the primary objects and link them' - in this case it was the text of the memorials, then you could connect the memorials through the words they share.

The 'maritime explorer' will let you start with a word or phrase and follow it through different memorials.
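One way to picture connecting memorials through the words they share is an inverted index from words to memorials. This is a sketch under assumptions: the IDs, texts and stopword list are made up, and nothing here reflects how the actual explorer was built.

```python
import re
from collections import defaultdict

# Hypothetical memorial IDs and transcripts.
memorials = {
    "M1": "Lost at sea off the coast of Wales",
    "M2": "Drowned at sea, aged 23",
    "M3": "Erected by his shipmates in Wales",
}

# Very common words are skipped so they don't link everything to everything.
STOPWORDS = {"at", "by", "his", "in", "of", "off", "the"}

def words_in(text):
    return re.findall(r"[a-z]+", text.lower())

def build_index(texts):
    """Map each non-stopword to the set of memorials that contain it."""
    index = defaultdict(set)
    for memorial_id, text in texts.items():
        for word in words_in(text):
            if word not in STOPWORDS:
                index[word].add(memorial_id)
    return index

def related(memorial_id, texts, index):
    """Memorials sharing at least one indexed word with memorial_id."""
    linked = set()
    for word in words_in(texts[memorial_id]):
        linked |= index.get(word, set())
    linked.discard(memorial_id)
    return linked

index = build_index(memorials)
```

Starting from a word, `index["sea"]` gives the memorials to follow; starting from a memorial, `related("M1", memorials, index)` gives its neighbours.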

The most interesting thing about the project is the outcome - not only new outputs (the explorer, KML, API), but also a better understanding of their data (geocoded, popular phrases, new connections between transcripts), and the idea that CSV files are probably good enough if you want to release your data for creative re-use.

Approaches to metadata enhancement might include curation, the application of standards, machine-markup (e.g. OpenCalais), social tagging or the treatment of data by artisans. This was only a short (2 - 3 weeks) project but the results are worth it.

[I can't wait to try the finished 'explorer', and I loved the basic message - throw your data out there and see what comes back - you will almost definitely learn more about your data as well as opening up new ways in for new audiences.]

Thursday, 24 April 2008

Notes from 'Unheard Stories – Improving access for Deaf visitors' at MCG's Spring Conference

These are my notes from the presentation 'Unheard Stories – Improving access for Deaf visitors' by Linda Ellis at the MCG Spring Conference. There's some background to my notes about the conference in a previous post.

Linda's slides for Unheard Stories – Improving access for Deaf visitors are online.

This was a two year project, fit around their other jobs [and more impressive for that]. The project created British Sign Language video guides for Bantock House. The guides are available on mp3 players and were filmed on location.

Some background:
Not all 'deaf' people are the same - there's a distinction between 'deaf' and 'Deaf'. The notation 'd/Deaf' is often used. Deaf people use sign language as their first language and might not know English; deaf people probably become deaf later in life, and English is their first language. The syntax of British Sign Language (BSL) is different to English syntax. Deaf people will generally use BSL syntax, but deaf people might use signs with English grammar. Not all d/Deaf people can lip-read.

Deaf people are one of the most excluded groups in our society. d/Deaf people can be invisible in society as it's not obvious if someone is d/Deaf. British Sign Language was only recognised as an official language in March 2003.

Their Deaf visitors said they wanted:
Concise written information; information in BSL; to explore exhibits independently; stories about local people and museum objects; events just for Deaf people (and dressing up, apparently).

Suggestions:
Put videos on website to tell people what to expect when they visit. But think about what you put on website - they're Deaf, not stupid, and can read addresses and opening hours, etc. Put a mobile number on publicity so that Deaf people can text about events - it's cheap and easy to do but can make a huge difference. If you're doing audience outreach with social software, don't just blog - think about putting signed videos on YouTube. Use local Deaf people, not interpreters. Provide d/Deaf awareness training for all staff and volunteers. Provide written alternatives to audio guides; add subtitles and an English voice over signed video if you can afford it.

Notes from Museums Computer Group (MCG) Spring Conference, Swansea

These are my notes from the Museums Computer Group (MCG) Spring meeting, held at the National Waterfront Museum, Swansea, Wales, on April 23, 2008.

Nearly all the slides are online and I also have some photos and video from the National Waterfront Museum. If you put any content about the event online please also tag it with 'MCGSpring2008' so all the content about this conference can be found.

The introduction by Debbie Richards mentioned the MCG evaluation project, of which more later in 'MCG Futures'.

I have tried to cover points that would be of general interest and not just the things that I'm interested in, but it's still probably not entirely representative of the presentations.

Debbie did a great job of saying people's names as they asked questions and I hope I've managed to get them right, but I haven't used full names in case my notes on the questions were incorrect. Please let me know if you have any clarifications or corrections.

If I have any personal comments, they'll be in [square brackets] below. Finally, I've used CMS for 'content management systems' and CollMS for 'collections management systems'.

I've made a separate post for each paper, but will update and link to them all here as I make them live. The individual posts include links to the specific slides.

'New Media Interpretation in the National Waterfront Museum'

'Catch the Wind: Digital Preservation and the Real World'

'The Welsh Dimension'

'Museums and Europeana - the European Digital Library'

'MCG Futures'

'Building a bilingual CMS'

'Extending the CMS to Galleries'

'Rhagor - the collections based website from Amgueddfa Cymru'

'Maritime Memorials, visualised'

'Unheard Stories – Improving access for Deaf visitors'

'National Collections Online Feasibility Study'

Notes from 'National Collections Online Feasibility Study' at MCG's Spring Conference

These are my notes from Bridget McKenzie's presentation, 'National Collections Online Feasibility Study' at the MCG Spring meeting. Bridget's slides are online: 'National Collections Online Feasibility Study'. There's some background to my notes about the conference in a previous post. Any of my comments are in [square brackets] below.

The partners in the National Collections Online Feasibility Study are the National Museum Directors' Conference, the V&A, the National Museum of Science and Industry, the National Maritime Museum, and Culture 24 (aka the 24 Hour Museum).

The brief:
Is it possible to create a discovery facility that integrates national museum collections; provides seamless access to item-level collections; and offers a base on which to build learning resources and creative tools? And can the nationals collaborate successfully?

The enquiry:
What's the scope? What's useful to different partners? What can be learnt from past and current projects? How can it help people explore collections? How can it be delivered?

There's a workshop on May 9th, with some places left, and another on June 18th; reports at the end of May and July.

Community of enquiry... people from lots of different places.

What are they saying?
"Oh no, not another portal!"
"You need to go to where the eyeballs are" - they're at Google and social networking sites, not at portals (but maybe at a few museum brands too).

It has to be understood in the context of why people visit museums. We don't know enough about how people use (or want to use) cultural collections online.

There's some worry about collaborative projects taking visits from individual sites. [Insert usual shtick about the need for the online metrics for museums to change from raw numbers to something like engagement or reach, because this is an institutional concern that won't go away.]

"Don't reinvent the wheel, see how other projects shape up": there's a long list of other projects on slide 9!

It's still a job to understand the options, to think about how they can be influenced and interoperate.

"We have to build the foundations first"
Needs: audience research - is there a market need for integrated collections?; establish clarity on copyright [yes!]; agreement on data standards; organisational change - communicate possibilities, web expertise within museums; focus on digitising stuff and getting it out there.

[re: the audience - my hunch is that most 'normal' people are 'museum agnostic' when they're looking for 'stuff' (and I mean 'stuff', not 'collections') - they just want to find 18th century pictures of dogs, or Charles and Di wedding memorabilia; this is different to someone looking for a 'branded' narrative, event or curated experience with a particular museum.]

"Let's just do small stuff"
Need to enable experiment, follow the Powerhouse example; create a sandbox; try multiple approaches - microformats, APIs, etc. [Woo!]

Does a critical mass of experimentation mean chaos or would answers emerge from it?

What does this mean?
Lots of options; questions about leadership; use the foundations already there - don't build something big; need a market- or audience-led approach; sector leadership needs to value and understand emerging technology.

Notes from 'Rhagor - the collections based website from Amgueddfa Cymru' at MCG's Spring Conference

These are my notes from the presentation 'Rhagor - the collections based website from Amgueddfa Cymru' by Graham Davies at the MCG Spring meeting.

This paper talked about the CMS discussed in Building a Bilingual CMS.

'Rhagor' is Welsh for more - the project is about showing more of the collections online. It's not a 'virtual museum'.

With this project, they wanted to increase access to collections and knowledge associated with those collections; to explain more about collections than can be told in galleries with space limitations; and to put very fragile objects online.

[He gave a fascinating example of a 17th century miniature portrait with extremely fragile mica overlays - the layers have been digitised, and visitors to the website can play dress-up with the portrait layers in a way that would never be physically possible.]

The site isn't just object records, it also has articles about them. There's a basic article structure (with a nice popout action for images) that deals with the kinds of content that might be required. While developing this they realised they should test the usability of interface elements with general users, because the actions aren't always obvious to non-programmers.

They didn't want to dumb down their content so they explain terms with a glossary where necessary. Articles can have links to related articles, other parts of the website and related publications, databases etc. Visitors can rate articles - a nice quick simple bit of user interactivity. Visitors can share articles on social networking sites, and the interface allows them to show or hide comments on site. Where articles are geographically based, they can be plotted onto a map. Finally, it's all fully bilingual. [But I wondered if they translate comments and the replies to them?]

In their next phase they want to add research activities and collections databases. They're also reaching out to new audiences through applications like Flickr and Google Earth, to go to where audiences are. If the content is available, audiences will start to make links to your content based on their interests.

The technology itself should be invisible; the user has an enriched experience through the content.

Questions:
Alex: to what extent is this linked with the collections management system (CollMS)? Graham: it's linked to their CMS (discussed in earlier papers), not their CollMS. They don't draw directly from the CollMS into the CMS. Their CollMS is a working tool for curators, needs lots of data cleaning, and doesn't necessarily have the right content for web audiences; it's also not bilingual.

Notes from 'Extending the CMS to Galleries' at MCG's Spring Conference

These are my notes from the presentation, Extending the CMS to Galleries by Dafydd James at the MCG Spring meeting. Dafydd's slides for Extending the CMS to Galleries are online. There's some background to my notes about the conference in a previous post. Any of my comments are in [square brackets] below.

This paper talked about extending the CMS discussed in Building a Bilingual CMS. See also notes from the following talk on 'Rhagor - the collections based website from Amgueddfa Cymru'.

Oriel I [pronounced 'Oriel Ee' rather than 'Oriel one' as I first thought] is an innovative and flexible gallery, created under budget constraints. Dafydd worked with the curatorial departments and exhibition designer.

It feeds 15 interactive touchscreens, 7 video streams and sound, and content can be updated by the curatorial department. They're using Flash; it was a better option at the time than HTML/Javascript, and it can be used alongside PHP for data.

They assigned static IP addresses to all PCs in gallery. Web pages ran in kiosk software on Windows XP PCs.

They had to get across to curators that they didn't have much room for lots of text, especially as it's bilingual. The system responds quickly if the user interacts - on the release action, though interactions need to be tested with 'normal' people. Pre-loading images helps.

Future plans: considering changing some of the software to HTML/Javascript, as there are more Javascript libraries available now, it can be faster to load, and it's open source. Also upgrading to a newer version of Flash as it's faster.

They're looking at using Linux, as they want more flexibility than Site Kiosk, which uses an IE6 engine.

They're thinking about logging user actions to find out what the popular content is, get user feedback, and they're trialling using handhelds with the CMS to deliver smaller versions of webpages.

Notes from 'Building a bilingual CMS' at MCG's Spring Conference

These are my notes from Chris Owen's presentation, 'Building a bilingual CMS' (for the National Museum of Wales) at the MCG Spring meeting. Chris' slides for 'Building a bilingual CMS' are online. There's some background to my notes about the conference in a previous post. Any of my comments are in [square brackets] below.

Why did they build (not buy) a CMS?
Immediate need for content updating, integration of existing databases.
Their initial needs were simple - small group of content authors, workflow and security weren't so important.
Aims: simplicity, easy platform for development, extensible, ease of use for content authors, workflow tailored to a bilingual site, English and Welsh kept together so easier to maintain.

It's used on the main website, intranet, SCAN (an educational website), Oriel I (more on that in a later talk), gallery touch-screens and CMS admin.

The website includes usual museum-y stuff like visitor pages, events and exhibitions, corporate and education information, Rhagor (their collections area - more on that later too) and blogs.

How did they build it?
[In rough order] They built the admin web application; created CMS with simple data structures, added security and workflow to admin, added login features to CMS, integrated admin site and CMS, migrated more complex data structures, added lots of new features.

They developed with future uses in mind but also just got on with it.

Issues around bilingual design:
Do you treat languages equally? Do you use language-selection splash screens or different domain names?
Try to localise URLs (e.g. use aliases for directories and pages so /events/ and /[the Welsh word for events]/ do the same [appropriate] thing and Welsh doesn't seem like an afterthought). Place the language switch in a consistent location; consider workflow for translation, entering content, etc.
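The URL alias idea can be sketched as a lookup from a localised path segment to a section and a language. This is an illustration only: the alias table, the function names and the Welsh aliases shown ('digwyddiadau' for events, 'casgliadau' for collections) are assumptions for the example, not the paths or code used on the museum's site, which is built in PHP.

```python
# Hypothetical alias table: each localised path segment maps to a
# (section, language) pair, so both languages reach the same content.
ALIASES = {
    "events": ("events", "en"),
    "digwyddiadau": ("events", "cy"),
    "collections": ("collections", "en"),
    "casgliadau": ("collections", "cy"),
}

def resolve(path):
    """Return (section, language) for a URL path such as '/events/', or None."""
    segment = path.strip("/").split("/")[0].lower()
    return ALIASES.get(segment)

def localised_path(section, lang):
    """Find the alias for a section in a given language (e.g. for a language switch)."""
    for alias, (sec, code) in ALIASES.items():
        if sec == section and code == lang:
            return f"/{alias}/"
    raise KeyError((section, lang))
```

With a table like this, a language switch in a consistent location only needs the current section and the other language code to build its link.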

Use two-character language codes (en/cy), organise your naming conventions for files and for database fields so Welsh isn't extra (e.g. collections.en.html and the equivalent .cy.html); don't embed localised strings in code. [It's all really nicely done in XML, as they demonstrated later.]
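A minimal sketch of those two conventions: language codes in file names, and interface strings kept out of the code. The talk describes an XML-based setup; the Python dictionary, the labels and the function names below are stand-ins invented for the example.

```python
LANGS = ("en", "cy")  # two-character language codes

# Hypothetical string catalogue, kept separate from page-rendering code.
STRINGS = {
    "en": {"search": "Search", "home": "Home"},
    "cy": {"search": "Chwilio", "home": "Hafan"},
}

def localised_filename(stem, lang, ext="html"):
    """Build names like collections.en.html and collections.cy.html."""
    if lang not in LANGS:
        raise ValueError(f"unsupported language code: {lang}")
    return f"{stem}.{lang}.{ext}"

def label(key, lang):
    """Look up an interface string instead of embedding it in code."""
    return STRINGS[lang][key]
```

Because both languages follow the same pattern, neither is a special case in the code.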

Coding tip: pull out the right lang at the start in SQL query; this minimises bugs and the need to refer to language later.
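Here is one possible reading of that tip, sketched in Python with SQLite: choose the language columns once, in the query, and alias them to neutral names so later code never mentions language. The table layout, column names and sample row are assumptions; the talk didn't show the real schema.

```python
import sqlite3

# Whitelist of language-specific columns; column names can't be passed as
# query parameters, so they are never taken directly from user input.
LANG_COLUMNS = {
    "en": ("title_en", "body_en"),
    "cy": ("title_cy", "body_cy"),
}

def fetch_page(conn, page_id, lang):
    """Select the right language up front and return language-neutral keys."""
    title_col, body_col = LANG_COLUMNS[lang]
    sql = (
        f"SELECT {title_col} AS title, {body_col} AS body "
        "FROM pages WHERE id = ?"
    )
    row = conn.execute(sql, (page_id,)).fetchone()
    return None if row is None else {"title": row[0], "body": row[1]}

# Hypothetical in-memory table with placeholder content.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages (id INTEGER PRIMARY KEY, "
    "title_en TEXT, title_cy TEXT, body_en TEXT, body_cy TEXT)"
)
conn.execute(
    "INSERT INTO pages VALUES "
    "(1, 'Events', 'Digwyddiadau', 'English body text', 'Welsh body text')"
)
```

Everything after `fetch_page` works with `title` and `body` only, which is where the reduction in language-related bugs would come from.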

It's built on XML, as they have lots of databases and didn't want to centralise/merge them together; this means they can just add new ones as needed.

Slide 16 shows the features; it compares pretty well to bigger open-source projects out there. It has friendly URLs, less chance of broken links, built-in AJAX features, and they've integrated user authentication and groups so there's one login for the whole website for users. The site has user comments (with approval) and uses reCaptcha. There's also a slide on the admin features later - all very impressive.

They've used OO design. Slide 18 shows the architecture.

Content blocks are PHP objects - the bits that go together that make the page. Localised. Because links are by ID they don't break when pages are moved. They're also using mootools.
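The 'links by ID' point can be shown with a tiny sketch: content stores a page ID, and the path is looked up only when the link is rendered. The page table and function names are hypothetical, and the real CMS uses PHP objects rather than Python.

```python
# Hypothetical page table: IDs are stable, paths can change.
pages = {
    101: {"path": "/collections/art/", "title": "Art"},
    102: {"path": "/events/", "title": "Events"},
}

def link(page_id):
    """Render a link from an ID, resolving the current path at render time."""
    page = pages[page_id]
    return f'<a href="{page["path"]}">{page["title"]}</a>'

def move_page(page_id, new_path):
    """Moving a page only changes its path; stored IDs stay valid."""
    pages[page_id]["path"] = new_path
```

After `move_page(101, "/rhagor/art/")`, every `link(101)` picks up the new path without any content being edited.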

The future: they want to have more user-centric features; work with the [Welsh project] People's Collection and other collaborations; APIs with other sites for web 2.ish things; more use of metadata features; they will make it open source if people think it's useful.

They would really open it up, via something like sourceforge, but would take the lead.

[Overall it's a really impressive bit of work, sensibly designed and well implemented. Between that and the Indianapolis Museum of Art I've seen some really nice IT web tools in museums lately. Well done them!]

Notes from 'MCG Futures' at MCG's Spring Conference

These are my notes from 'MCG Futures' by Jon Pratty and Ross Parry (presented by Jon Pratty) at the Museums Computer Group (MCG) Spring Conference. There's some background to my notes about the conference in a previous post. If I've made any comments below they're in [square brackets]. The slides for MCG Futures are online.

Jon: in this presentation he will outline the evaluation process, present some of the feedback to date and the timescales.

We've got our work cut out getting the level and quality of feedback we need.

Disclaimer: he's not presenting the personal views of Jon or Ross; but presenting what the membership think so far.

The MCG has an 'astonishing heritage' of meetings and discussions held across the country and throughout the year [slide 2 is a list of all the meetings - this is the 51st]. There's a rich archive of content, proceedings, papers, etc. The MCG has a valuable archive, culture, way of working and communal history.

As the web starts to move faster than the organisation, what do you do? What does the momentum of technology mean? Are we keeping up with changes?

Slide 3 is the timetable for consultation, formulation and action - changes to be agreed at the AGM in autumn 2008. Slides 5 - 10 present some of the feedback so far.

Are we reaching out far and hard enough? They've had 20 - 25 specific feedback emails, fewer from the online form. They will be asking other organisations how they do it so they can get more feedback. The big steps that might be coming require feedback from a bigger sample of the membership. [So if you want to see change, you have to send comments! It's ok to be critical, and it's ok to write about what you like already.]

The Autumn meeting will be crucial - if there are going to be changes, information has to go out before the Annual General Meeting so the membership have time and notice to consider those changes.

It's going to happen relatively quickly - it's 'not a long period of navel-gazing'.

Some thoughts based on comments so far:
Is MCG a collection of voices or a unified voice?
Do we set agendas or reflect them?
Are we as technologists [doers] disenfranchised from the people who make decisions?
Web or print?
What's the role of newsletter?
Would you want a blog? (This asks more of a group of people or the MCG committee; if so, how would that work?)
What about membership fees?

Questions - what about the:
Function: advocacy, research, collaboration?
Governance: structure, responsibility, size?
Interactions: frequency, location, focus?
Membership: composition, benefits, cost?
Outputs: newsletter, reports, web?
Affiliations: professional, governmental, commercial?

[If you missed the first call for feedback, you can email Debbie Richards, use the feedback form, or discuss it on the MCG list. It doesn't matter if you're not a member, or not in IT/a technologist or not a web person - your opinion is valuable.]

The UK Museums and the Web Conference will be at the University of Leicester on June 19, 2008.

Notes from 'Museums and Europeana - the European Digital Library' at MCG's Spring Conference

These are my notes from David Dawson's presentation 'Europeana - Museums and the European Digital Library' at the MCG Spring Conference. There's some background to my notes about the conference in a previous post. If I've made any comments below they're in [square brackets].

David's slides for 'Europeana - Museums and the European Digital Library' are online.

Europeana is the new name for the European Digital Library (EDL).

The EDL is a political initiative - part of i2010, the EU's IT strategy. It will provide a common point of multilingual access to online 'stuff'. It includes the TEL project (The European Library - catalogue records of national libraries) and MICHAEL.

The Europeana 'maquette' was launched in February, showing how it might work in a few years' time. 'One or two little issues still need working on'. 'Themes' aren't really being taken forward. It has social tagging (going into faceted browsing [did I get that right?]). It works around who, what, where and when, and includes a timeline. It will have 7 million pieces of content.

Europeana and MICHAEL (multilingual websites/digital collections from cultural heritage sector across Europe).
MICHAEL doesn't reach to item level, just collection descriptions. It also relates to collection descriptions in TEL.

Why are service registries needed?
Map of where content is and how it is managed.
Information Environment Service Registry
Machine to machine services; will know what schemas and terminologies have been used. Interoperability protocols.
(Translated subject terminology and screen material into Welsh.)

EDLNet project. Interoperability Working Group.
MinervaEC - the Minerva technical guidelines are being revised/updated. The previous guidelines were downloaded 60,000 times in 9 languages - this indicates the appetite for guidelines.

Slide 14 shows the path from institutional databases to national or theme/topic-based portals, and from there into the EDL. [The metadata storage diagram on slide 15 is what's currently being built, slide 14 is a year old.]

It will support RDF triples. It will offer simple, advanced and faceted search [faceted search as browsing].

APIs would provide the mechanism to enable many different uses of the metadata. The benefit is then in the underlying services, not just website. [But if we want APIs, we have to ask for them or they might not happen.]

How to promote your content in Europeana?
Create your content using open standards. If you are already using the Minerva technical standards, then you should be able to supply your metadata so that it can link into something that will go into Europeana.

You should use your existing metadata standards and prepare to map your data to domain-specific Dublin Core Application Profiles. [Does domain specific mean there won't be one schema for museums, libraries, and archives; but possibly schemas for each? A really usable schema for museum data is the other thing we need to make APIs the truly useful tool they could be, even if different types of museums have slightly different requirements from a schema.]

Terminologies - prepare to take advantage of the semantic web. Publish terminologies and thesauri using SKOS - it's machine readable, can be used by search engines. [Using computers to match ontologies? Semweb FTW! Sorry, got a bit excited.]
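For anyone wondering what 'machine readable' looks like here, this is a small sketch that writes one SKOS concept as RDF triples in N-Triples form. The SKOS and RDF namespaces are the real ones; the `example.org` term URIs, the concept itself and the helper names are made up for illustration, and the literal helper does no escaping, so it only suits simple labels.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/terms/"  # hypothetical namespace for a thesaurus

def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"

def literal(text, lang):
    """A language-tagged literal (no escaping: simple labels only)."""
    return f'"{text}"@{lang}'

# Subject, predicate, object: a concept, its label and a broader term.
triples = [
    (uri(BASE + "memorial"), uri(RDF_TYPE), uri(SKOS + "Concept")),
    (uri(BASE + "memorial"), uri(SKOS + "prefLabel"), literal("memorial", "en")),
    (uri(BASE + "memorial"), uri(SKOS + "broader"), uri(BASE + "monument")),
]

def to_ntriples(rows):
    """Serialise triples one per line, each ending with ' .'"""
    return "".join(f"{s} {p} {o} .\n" for s, p, o in rows)

output = to_ntriples(triples)
```

A search engine or another collection's system can read statements like these without knowing anything about the database they came from.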

Register your content and services with existing registries like TEL and MICHAEL.

All EU member states must: increase digitisation, tackle access, sort IPR, enable preservation.

Practicalities: in the UK the People's Network Discover Service (PNDS) currently harvests 500,000 digital objects. All MLA funded activity requires participation. Other projects, like Exploring 20th Century London, are using the PNDS infrastructure. The PNDS will contain an estimated 4 million digital objects by [the end of] 2008. It will be integrated into Culture 24 and the Collections Trust Subject Specialist Networks; part of same national infrastructure.

eContentPlus and EDLocal - support for institutions to get metadata into PNDS.

Timetable (slide 20): May 23, project conference launch [ask for information if you want to have your say]; June 4th, launch of Due Diligence Guidelines on Orphan Works [which will be useful for recent discussions about copyright and the cultural heritage sector].

23rd, 24th June - Europeana initial prototype reviewed - call for volunteers?
It's important to have museums people at the conference in order to represent museum-specific requirements, including the need for an API. It might be possible to fund museum people to get there.

November 2008: high profile launch.

After May 23rd David will be on the other side of the fence, and his question will be 'how can I get my content into the PNDS, Culture 24, Europeana?'.

Questions
Mike: is the API a must? David: it is for him, for the project managers it might be a maybe. Mike: without an API it will die a death.

Andrew: thanks to David for his work at the MLA (and the MCG). From May 24th [after David leaves], how does the MLA support this work? David: he expected an announcement would have been made, but as they haven't made one yet it's difficult to answer that.

Me: how can we as museums advocate or evangelise about the need for an API? David: go to the conference, represent views of institutions.

This session ended with thanks from Debbie and a round of applause for David's contributions to the Museums Computer Group.

Wednesday, 23 April 2008

Notes from 'The Welsh Dimension' at MCG's Spring Conference

These are my notes from 'The Welsh Dimension: Issues facing the museum sector in Wales' by Lesley-Ann Kerr at the MCG Spring meeting. Lesley-Ann's slides are online. Her presentation was about the issues facing the sector in Wales and the challenges of working in a small country. It helped contextualise some of the later papers, particularly the importance of bilingual sites and online collections.

There's some background to my notes about the conference in a previous post. Any of my comments are in [square brackets] below.

Background to Welsh context:
The landscape: literal and metaphorical; lots of Wales is rural, with a dispersed population, and poor south-north transport infrastructure - this has practical implications. There are variations in the coverage of local authority areas.

Online collections are important for providing access to key collections online and background about them for people in remote areas.

Language: there's a Welsh Language Act; they are required to offer bilingual services, and some people use Welsh as their first language. 25% of the population is Welsh speaking.

This presents challenges in: translation costs; display (twice the text; do you display both on one page or on separate pages online/on screen, and how do you deal with text on panels nicely?); online databases (e.g. standard terminologies are difficult enough in one language); support for Welsh character set (e.g. requires Unicode); characters can be corrupted in document conversion; all planning must take bilingual challenges into account; maintenance issues - content must be in both languages, Welsh language versions can get out of step.

Use tools to translate searches; pick up keywords in both languages and present results appropriately. Switching between languages must be easy, particularly for learners when they reach limits of their Welsh.
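A very simple version of 'pick up keywords in both languages' is to expand each search term with its equivalent in the other language before searching. The keyword pairs and function name below are assumptions for the sketch; a real service would draw on a managed bilingual terminology and would also need to handle Welsh mutations and plurals.

```python
# Hypothetical bilingual keyword pairs (English, Welsh).
PAIRS = [("coal", "glo"), ("slate", "llechi"), ("wool", "gwlân")]

# Each term maps to the set containing itself and its equivalent.
EQUIVALENTS = {}
for en, cy in PAIRS:
    EQUIVALENTS[en] = {en, cy}
    EQUIVALENTS[cy] = {en, cy}

def expand_query(query):
    """Return the search terms plus any known equivalents in the other language."""
    terms = set()
    for word in query.lower().split():
        terms |= EQUIVALENTS.get(word, {word})
    return terms
```

Results found via the expanded terms can then be shown in whichever language the visitor is currently using.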

Cultural identity: it's about more than just language; there's a sense of place. Socially, the question 'where are you from' comes before the question 'what do you do'. Providing services to a dispersed population can help provide cohesive sense of identity. The role of online collections is again important.

"What are we doing"
'Spotlight on museums' - a comprehensive survey.

'Quantifying Diversity' across museums, libraries and archives; equality strands around collections, staff, users.

Museum strategy for Wales - it comes from and is for museums - government policy will come from that [great! that seems like the right way around]. It provides a strategic way forward for museums in Wales but is about evolution rather than revolution.

One Wales (coalition agreement in government) includes commitment for an online People's Collection. They'll work with existing partners; will be using some cutting edge tech, some social networking stuff; looking not just at organisations with content and at developing new content but at involving users too. They'll be looking at Canadian and Australian sites [what is it with countries with dispersed populations and good collective digitisation programs?].

Notes from 'New Media Interpretation in the National Waterfront Museum' at MCG's Spring Conference

These are my notes from the presentation 'New Media Interpretation in the National Waterfront Museum' by Steph Mastoris at the MCG Spring Conference. There's some background to my notes about the conference in a previous post. Any comments in [square brackets] are mine.

I've put up lots of photos and some video from the lunchtime tour of the interactives at the NWM.

Some background about the National Waterfront Museum (NWM):
The aim of the museum is to talk about the industrialisation of Wales. Its precursors were the Welsh Industrial Maritime Museum and the Swansea Maritime and Industrial Museum - in some ways these were unsuccessful museums.

The focus is on the human experience of industrialisation rather than the technology; it's a celebration of the impact rather than the technology itself.

Interpretation was intended to be delivered through new media right from the start. Objects are jumping off points for interpretation.

Displays are in zones; but using open-ended concepts rather than themes or chronological order. They are kaleidoscopic rather than comprehensive. Both a criticism and a strength of the approach is that you end up with a fragmented view of a subject, though you can pick it up throughout the gallery. It's not for specialists, but for a population who've never given industrialisation a second thought. It's audience-led, and not afraid to be populist.

Types of new media in the National Waterfront Museum:
The new media ranges from traditional looping films with personal testimony or oral history to audience-initiated stuff. Way-finders have visitor-activated mapping and constantly changing displays, and there are completely visitor-centred (activated?) displays. The 'People' interactive displays move from a map to a digital reconstruction of town, to a digital reconstruction of a house, then to embedded images of artefacts (also visible in the cases) then optionally onto detail that contextualises the artefacts. Visitors can manipulate stuff on screen, initiate oral histories and move onto items that are in collections but not on display.

Visitor reactions, 30 months into the project:
The response has been 'amazingly good'; they've exceeded their targets for visitor numbers in both the first and second years (and avoided the second year slump). The museum was designed around free access, with three entrances that encourage people to pop in and out. He suspects they are getting lots of short-term visitation as well as a lot of the 'classic one-and-a-half hour' visits. It helps that they have an extremely active and community focused events program - their aim is to 'turn the main hall into the village hall of Swansea'.

Their visitors mirror the Welsh population in terms of age; not as much on social grade but they're doing better than other Welsh museums. They're fulfilling their populist agenda as best as they can.

The digital divide by age starts to become apparent when you look at 'enjoyment'. Older groups are having difficulties with something; the interactive computer parts are least popular for the 55+ group. They like the traditional galleries more. They're addressing this with proactive gallery staff and by adding value to other aspects. Other museums thinking about interactives have to address this as the sector moves towards a new media approach.

The delights and problems of being a new media museum:
Buildings: you will need to pay attention to the critical path of power, plant and equipment (all it takes is one thing to go wrong in the chain - e.g. something ('some plastic bags, fifteen condoms and a dead dog') blocking the water supply that cools the building); physical access for servicing IT components (e.g. changing bulbs in projectors); the effect of the building design on new media display performance (e.g. sunlight on screens or projections).

Costs: energy costs, consumables (e.g. projector bulbs are a huge cost per year), support contracts, product renewal. They spend nearly £30,000 a year on projector bulbs!

People: technical team (restructure the organisation around a dynamic highly-skilled technical team); contractors; gallery authors (act as interpretative consultants and mediate between curators and designers and audiences).

Attitudes: display down-time (5% of displays are down at any one time; it's a moving target, just one of those things. Technology is fragile - it's a big change from when museums only had static cases); staff flexibility and the creativity to deal with these new challenges; corporate perceptions of wealth (lots of money coming in but it's all being spent), managing expectations (people think IT can do anything easily).

Display renewals [slide]:
Inter-relationship between collections and display, conservation. 'sacrificial artefacts' - not accessioned.
You can have a small impact from large expenditure.
Inter-referenced displays - what if a way finder points visitors to something that isn't currently on display?
How do you maintain a technical cutting edge when things designed in 2002/03 are still on display? New technologies are fragile and become more robust in later versions, but as they become more common they also become more boring.
What is the next big thing? They're planning time to look at what's upcoming.

Notes from 'Catch the Wind: Digital Preservation and the Real World' at MCG's Spring Conference

These are my notes from Nick Poole's presentation 'Catch the Wind: Digital Preservation and the Real World' at the MCG Spring Conference. There's some background to my notes about the conference in a previous post. If I've made any comments below they're in [square brackets].

Nick's slides for 'Catch the Wind: Digital Preservation and the Real World' are online.

The MDA is now the Collections Trust. Their belief is that "everybody everywhere should have the right to access and benefit from cultural collections". Their work includes standards, professional development and public programmes wherever collections are kept and cared for and they have a remit across collections management, including documentation, digitisation and digital preservation.

We need to think about capturing and preserving digital surrogates, etc, or we'll end up with a 'digital dark age'.

We need a convergence of standards and practice in museums, libraries and archives, and to develop a community of professional practice.

Nick was interested to know whether any museums are actively doing digital preservation. It turns out lots have some elements of digital preservation but it's not deeply embedded in the organisation. Nick sent a question to the Museums Computer Group (MCG) list: see the list archives for December 2007, or slide 6.

If you're not doing digital preservation, why not? And how do you decide whether and what is worth preserving? How do you preserve pieces of information or digital assets in the context needed for them to make sense?

Today is partly about the results of the enquiry begun with that email.

We know what we should be doing [slide 9, CHIN slide on workflow for 'Digital Preservation for Museums'.]

We know why we should be doing it:
The preservation and re-use of digital data and information forms both the cornerstone of future economic growth and development, and the foundation for the future of memory.
From "Changing Trains at Wigan: Digital Preservation and the Future of Scholarship" by Seamus Ross - the 'common-sense bible about digital preservation'.

And there are lots of programs and diagrams (slides 11 - 15).

So if we know why and how we should be doing it, why aren't we doing it?

It's not necessarily about technology or money - is it about the culture in museums?
There's no funding imperative; project-funded digitisation seldom provides for (or requires) the kind of long-term embedded work that digital preservation requires.

It depends on the integration of workflows and systems which is still rare in museums. Some digital preservation principles fit more intuitively with an archival point of view than an object/artefact point of view.

Is it possibly also because museums aren't part of the scholarly/academic publishing loop which is giving rise to large scale digital preservation initiatives? e.g. Open Content Alliance.

We also don't have the same expectation about the retrievability of non-object museum information that we do about collection information. [Too true, it doesn't seem to be valued the same way.]

We should learn from libraries and archives. We could mandate 'good enough' standards so digital assets can be migrated into stable environments in the future. There's so much going on that we'll never be able to draw a line in the sand and say 'standards happen now'. We need to tweak the way we work now, not introduce a whole new project.

A proposed national solution: could we aggregate 'just enough' metadata at a central point and preserve it there? But would organisations become disenfranchised from their own information, lose expertise in the curatorship of digital content, and would it blur the distinction between active and dormant records?

If not a national solution, then it must be local: but would it actually happen without statute, obligation or funding? Possibly through networks of people who support each other in digitisation work, but there are economic issues in developing infrastructure and expertise.

Museums seem oddly distant from current initiatives (e.g. Digital Preservation Coalition, Digital Curation Centre), and lack methodologies and tools that are specific to museum information. Do we need to develop collective approaches for digital preservation?

He hasn't got answers, just more questions.

We must start finding answers or the value of what we're doing right now will be lost in ten years' time.

Questions
Mike: there was a slide 'is this stuff worth preserving' - but that question wasn't answered - is there lots of stuff we should and can just chuck away? Nick: the archival world view is more like that.

Alan [?]: born digital stuff like websites is difficult to 'index and scope'. The V&A website is divorced from libraries and archives - internal databases don't link to website [to capture non-collections records?]. What are the units of information or assets within a website? It's impossible to define boundaries and therefore to catalogue and preserve... How do we capture this content? Nick: web archiving solutions are already out there but do museums have the money for it?

John: to what extent could digital repositories be out-sourced? Nick: look at examples like the Archaeology Data Service. But for whatever reason, we're not following those models.

David: preservation was in NOF Digitise in business plan but ... didn't happen. He doesn't think archives are ahead in preservation services. Museums' use of collections management systems is different to academia using repositories - there's an interesting distinction between long-term archiving and day to day work.

Ian [?] - re-run what we've done with [digitising] object collections but think about information collections too [?]. Nick: there's a development path there in existing CollMS, possibly with hosted CollMS. We don't need entirely new systems, we already have digital asset management systems (DAMS), web software, CollMS.

[This reminds me about recent discussions we've had internally about putting older object captions and information records on our OAI repository - this might be a 'good enough' step towards digital preservation.]

Monday, 21 April 2008

Notes from 'User-Generated Content' session at MW2008

These are my notes from the User-generated content session at Museums and the Web, Montreal, 2008. All mistakes are mine, any corrections are welcome, my comments are in [square brackets] below.

The papers presented were The Art of Storytelling: Enriching Art Museum Exhibits and Education through visitor narratives by Matthew Fisher, Alexandra Sastre, Beth Twiss-Garrity; The Living Museum: Supporting the Creation of Quality User-Generated Content by Allison Farber, Paul Radensky and Getting 'In Your Face': Strategies for Encouraging Creativity, Engagement and Investment When the Museum is Offline by Martin Lajoie, Gillian McIntyre, Ian Rubenzahl, Colin Wiginton.

The "Art of Storytelling" project at the Delaware Art Museum.
The paper covered visitor-contributed content (VCC), the key factors to success, and the motivations behind allowing visitors to contribute. The paper goes into more of the theoretical foundations.

The key findings of the Art of Storytelling project evaluation:

What's the value to the visitor who contributes? It engages visitors in thinking critically and creatively about and in response to art.
What's the value to the museum? They get feedback and get to engage new audiences.
What's the value to the other visitors? There is some confusion, but also it can be inspiring, enriching, and encourage others to participate.

It's not appropriate or appealing for all collections, but it's hard to predict which in advance.

[When looking at models of participation:] Allow the motivated to contribute, allow the rest to benefit; don't penalise for non-participation.

Simplicity - remove barriers to entry (engage first, login last if at all).

Promote in traditional and non traditional venues; have a clear invitation to participate.

Motivating factors included curatorial encouragement (have curators there encouraging people), juried selections, stipends for selected storytellers.

What's in it for the audience? They get 15 minutes next to an Andy Warhol. Establish an audience for your visitor-contributors - affiliation with an institution is valued, even if it comes with caveats. [Theirs is an art museum - does that work for all audiences, or for all types of content?]

Don't try and create communities beyond your collections (because others with bigger budgets are after the same eyeballs).

They gave lots of examples from a summary of evaluation of other projects (but the slides were hard to see):
Isabella Stewart Gardner Museum in Boston - thinking through art. Active looking skills.
The Wolfsonian Institute in Miami - Artful Citizenship Project.
Guggenheim - Learning Through Art.
Aldrich Contemporary Art Museum - Round by Halsey Burgund.
Denver Art Museum - Frederic Remington exhibit, Hamilton building programming.
Philadelphia area art museums; art, literacy, museums.

They then provided some background information about the Art of Storytelling at the Delaware Art Museum - story telling kiosks in the museum.

Their 'a-ha' moment was realising that the experience was more transformational for the original story tellers (i.e. the adults) than the audience. Transcend traditional ideas re lack of authority.

In conclusion, visitor-contributed content programs are valuable to the original contributors and museums get valuable insight into audiences, but it's an open question as to whether they're valuable for other visitors.

The Museum of Jewish Heritage, supporting the creation of quality user-generated content in the "Living Museum".
The project started as an outreach project for Jewish communities without a local museum.

The process: students visit a local museum, learn a bit about how museums are structured in terms of display of objects and organisation. The students choose an artefact to represent Jewish heritage at home, then write labels at school. They then organise artefacts into galleries, create gallery title and text panels, then create in-school and online exhibitions. Students research artefacts and tell their own stories. They include measurements because you can't get a sense of scale on the internet. One piece of text tells story of artefact and the other tells significance of artefact to family.

[They aren't promoting it until they feel the site is ready - a familiar story!]

They run seminars with teachers to help them submit quality content, but they still get objects unrelated to Jewish heritage, text with spelling and grammar errors, incomplete labels, unclear photos, factual errors.

The goals included: educational - a connection to Jewish heritage, parental involvement, museology, improving writing and research skills; institutional - high quality images so people can see what they're looking at; privacy of students; motivation of students.

Teachers say students are motivated to get stuff right because it's going online.

Unlike other user-generated content sites, the content is pre-reviewed, as the project has very specific educational goals. They control who creates content on the website (anyone can use but only teachers from Jewish schools can post content) and check that kids can't be identified. They are trying not to expose the kids and their personal stories to comment or modification by other users.

They are going to implement a spell checker.

How can they help users to contribute quality content? Convey expectations, consider needs of both kinds of users, offer support, concentrate on process and product of exhibition creation, review submitted content and offer recommendations.

"In Your Face", Art Gallery of Ontario
In Your Face: The People's Portrait Project.
While the Art Gallery of Ontario building was under construction, they were interested in rethinking the way the organisation worked, and how to keep people connected to the institution.

They got the idea from UK's National Portrait Gallery BP prize, but they didn't want a contest.

They advertised on back of the national paper for a few weeks as they had free space. They wrote a copyright statement that people had to sign, specified a size, and said they would hang every piece that came in. They received 17,000 portraits.

First they got lots of entries from rural areas, then the rest of Canada and the world. There was an extraordinary variety in portraits, and also in the parcels. They also arrived with stories.

It was a one way thing - people knew they weren't going to get the portraits back. It mattered to them to get them exhibited in the AGO.

There was more diversity in portraits and in the people who came to see them than usually seen in the gallery. People paid money to see the portraits as they were 'after the gate'. It was lots of work for staff on top of their normal job cos it turned out to be huge, but gave them (the staff) energy. It also made their audiences real for the staff and helped make the institution inclusive.

Contemporary artists from their collection also sent in portraits, but their names weren't shown so it was all egalitarian.

They also created a Flickr group (but weren't able to get that projected in the gallery). It now has 10,000 portraits in it.

They had a parallel project - Collection X. It was an online project where visitors could make their own exhibition. Collect, connect and create. 'Open source museum' [- the online paper goes into more detail, including the use of RSS.]

Partly [?] as a result of the project, guiding principles developed for institution include relevance, responsiveness, creativity, transparency, diversity and forum.

Questions asked:
How do you balance museum's agenda with visitor expectations? Is it possible to assert control and foster programming that is open-ended? How do we think about expertise, quality and standards? How do we integrate and manage creativity in ways that are dynamic and long-term? Is curatorial expertise or audience experience paramount? Some things curators are uneasy about. Dynamism also means some volatility - how does that work?

Lessons learned:
Take risks, experiment and be willing to make mistakes
Museums can function as catalysts for creativity [my emphasis, this was the meme that ran through the whole session, for me]
A critical mass of creativity asserts its own kind of aesthetic
There is value in integrating user-generated content that is actual as well as virtual
Museum and the public can function as producers and consumers of culture to create a shared sense of ownership
The public will be invested if programming is authentic and they feel respected.

Sunday, 20 April 2008

Crowdsourcing metadata cleaning?

If you're interested in another perspective on dealing with user-generated tags or metadata, this blog post from last.fm, Fingerprinting and Metadata Progress Report talks about how they're trying to create 'order from chaos':
So far our fingerprint server identified 23 million unique tracks, from the 650 million fingerprint requests you’ve thrown at it. Who knows how many unique tracks there are out there.. We have a couple of hundred million tracks based on spelling alone – but not all of them are spelt correctly.

They have some interesting issues to deal with in cleaning up their data (i.e. your data, if you're a last.fm user), especially when 'the most popular spelling is not necessarily the correct one'. And what about bands that change their name (but are essentially the same band) or line-up (are they still the same band?) - when do you decide to create a new identifier?

They're letting users who are logged in vote on potential corrections to an artist name, effectively testing crowdsourcing metadata corrections as well as the original data creation process. This model could work for museums - depending on the collection, some museums already get a lot of corrections when parts of their collections are published online. What would happen if we made that process transparent?
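[To make the idea concrete, here's a rough sketch of how votes on a proposed correction might be tallied. It is not last.fm's actual method, which the post doesn't spell out; the function name and both thresholds are assumptions of mine.]

```python
from collections import Counter


def pick_correction(current, votes, min_votes=5, min_share=0.6):
    """Return the winning spelling, or keep `current` if support is too thin.

    `votes` is a list of spellings, one per logged-in voter. Both thresholds
    are hypothetical and would need tuning against real data.
    """
    tally = Counter(votes)
    if not tally:
        return current
    candidate, count = tally.most_common(1)[0]
    total = sum(tally.values())
    # Require enough voters overall and a clear majority for one spelling.
    if total >= min_votes and count / total >= min_share:
        return candidate
    return current


print(pick_correction("Bjork", ["Björk"] * 4 + ["Bjork"]))
print(pick_correction("Bjork", ["Björk"] * 2))
```

Thresholds like these only filter out thinly supported changes. They don't solve the problem that the most popular spelling isn't necessarily the correct one, so a museum would probably still want a curator or moderator to review the winning suggestion.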

Saturday, 19 April 2008

Notes from 'Object-Orientated Democracies: Contradictions, Challenges And Opportunities' in 'Theoretical Frameworks' session, MW2008

These are my notes from the first paper, 'Object-Orientated Democracies: Contradictions, Challenges And Opportunities' in the Theoretical Frameworks session chaired by Darren Peacock at Museums and the Web 2008. I'll post the others later because the 'real world' is calling me to a 30th now.

I didn't blog these at the time because I wanted to read the papers properly before talking about them. I probably still need a bit longer to digest them, but the longer I leave it the more vague my memory will get and the less likely I am to revisit the papers, so please excuse (and contact me to correct!) any mistakes or misinterpretations. I'm not going to summarise the papers because you can go read them for yourself at the links below (one of the truly fantastic things about the Museums and the Web conferences, IMO), I'm just pulling out the bits that pinged in my brain for whatever reason. My comments on what was said are in [square brackets] below.

The papers were Object-orientated democracies: contradictions, challenges and opportunities by Fiona Cameron, Who has the responsibility for saying what we see? mashing up Museum and Visitor voices, on-site and online by Peter Samis and The API as Curator by Aaron Straup Cope.

Darren introduced the session theme as 'the interplay between theory and practice'.

Fiona Cameron, Object-orientated democracies.

Museums currently use collections to produce stable, ordered, certain meanings. Curators are the gateway to a qualified interpretation of the object. [Classification and ordering as a wish-fulfilment exercise in 'objective', scientific recording, regardless of social or cultural context?]

However, the 'networked' (online, digital?) object overturns hierarchical museum classifications and closed museum-specific interpretive paradigms.

Online objects taking 'active role in social networks and political agendas'. [Objects re-appropriated in role as cultural signifiers by the communities they came from - cool!]

'Heritage significance is where the museum meets pop culture.'

Collection information becomes fluid when released into network, flow, subject to interactions with other resources and ideas.

From the paper: "Clearly, the more technology facilitates a networked social structure and individual cultural expression, as seen most recently with Web 2.0, the more difficult it becomes for museums to produce universal or consensual meanings for their collections."

[Why would museums want to (claim to) produce universal meanings anyway? One of the exciting possibilities of linking from each of our online objects to its instance in various museum projects is the potential to expose the multiplicity of interpretations and narrative contexts produced around any single object, even within the same museum. Also, projects like 'Reassessing What We Collect' are an acknowledgement that a 'universal' reading is in fact problematic.]

Bruno Latour: object-orientated democracies. "For too long, objects have been wrongly portrayed as matters of fact."

Objects as mediators in assertion of associations, not just cultural symbols. How are competing readings inscribed in collections documentation context?

Collections wikis - how interactions between museum and public culture might inform new collection spaces.

Test cases for 'Reconceptualising Heritage Collections' - politically charged objects - coin and wedding dress. Wiki and real time discussion with curators, Palestinian Australians, Jewish readings of the same objects - many different readings.

Placing objects in open/public wiki was seen as problematic - assault on Palestinian culture. Role of museums in this... protection, 'apolitical gatekeeper', governance?

Collections as complex systems. [Complexity as problem to be smoothed out in recording.]

Objects derive meaning and significance from a large number of elements, multi/inter/disciplinary or from outside the museum walls. [Too much on that slide to read!]

Curators as expert groups within proposed systems; group boundaries are permeable. Static museum categories become more ambiguous as objects are interpreted in unexpected, interesting ways. Role in mapping social world around a collections item. Equilibrium vs chaos?

"Objects are able to perform at a higher level of complexity."

Issues re: museum authority and expertise, tensions between hierarchical structures and flexible networks, sustainable documentation practice, manage complexity.

[I think one of the reasons I liked this so much on a personal level is that it has a lot of parallels to the thinking I had to do about recording structures for post-processual archaeology at Çatalhöyük Archaeological Project - relational archaeological databases as traditionally conceived don't support the recording of ambiguity, uncertainty, plurality, multiplicity or of interpretative context.

I also like the sense of possibilities in a system that at first might seem to undermine curatorial or organisational authority - "Objects are able to perform at a higher level of complexity". The role of museums, and the ways curators work, might change, but both museums and curators are still valued.]

Nielsen on 'should your website have concise or in-depth content?'

Long pages with all the text, or shorter pages with links to extended texts - this question often comes up in discussions about our websites. It's the kind of question that can be difficult to answer by looking at the stats for existing sites because raw numbers mask all kinds of factors, and so far we haven't had the time or resources to explore this with our different audiences.

In Long vs. Short Articles as Content Strategy Jakob Nielsen says:
  • If you want many readers, focus on short and scannable content. This is a good strategy for advertising-driven sites or sites that sell impulse buys.
  • If you want people who really need a solution, focus on comprehensive coverage. This is a good strategy if you sell highly targeted solutions to complicated problems.
...

But the very best content strategy is one that mirrors the users' mixed diet. There's no reason to limit yourself to only one content type. It's possible to have short overviews for the majority of users and to supplement them with in-depth coverage and white papers for those few users who need to know more.

Of course, the two user types are often the same person — the one who's usually in a hurry, but is sometimes in thorough-research mode. In fact, our studies of B2B users show that business users often aren't very familiar with the complex products or services they're buying and need simple overviews to orient themselves before they begin more in-depth research.

Hypertext to the Rescue
On the Web, you can offer both short and long treatments within a single hyperspace. Start with overviews and short, simplified pages. Then link to long, in-depth coverage on other pages.

With this approach, you can serve both types of users (or the same user in different stages of the buying process).

The more value you offer users each minute they're on your site, the more likely they are to use your site and the longer they're likely to stay. This is why it's so important to optimize your content strategy for your users' needs.


So how do we adapt commercial models for a cultural heritage context? Could business-to-business users who start by familiarising or orienting themselves before beginning more in-depth research be analogous to the 'meaning making modes' for museum visitors - browsers and followers, searchers or researchers - identified by consultants Morris, Hargreaves, McIntyre?

Is a 'read more' link on shorter pages helpful or disruptive of the visitors' experience? Can the shorter text be written to suit browsers and followers and the 'read more' link crafted to tempt the searchers?

I wish I could give the answer in the next paragraph, but I don't know it myself.

Museums and Clayton's audience participation

A comment Seb left on Nate's blog post about "master" metadata got me thinking about cognitive dissonance and whether museums who say they're open to public participation and content really act as if they are. Are we providing a Clayton's call for audience participation?

If what you do - raise the barrier to participation so high that hardly anyone is going to bother commenting or tagging - speaks louder than what you say - 'sure, we'd love to hear what you have to say' - which one do you think wins?

To pick an example I've seen recently (and this is not meant to be a criticism of them or their team because I have no idea what the reasons were) the London Transport Museum have put 'all Museum objects and stories on display in the new Museum' on their collections website, which is fantastic. If you look at a collection item, the page says, "Share a story with us - comment on this image", which sounds really open and inviting.

But, if you want to comment, they ask for a lot of information about you first - check this random example.

So, ok. There are lots of possible reasons for this. UK museums have to deal with the Data Protection Act, which might complicate things, and their interpretation of the DPA might mean they ask for more information rather than less and add that scary tick box.

Or maybe they think the requirement to give this information won't deter their audience. I'd imagine that London Transport Museum's specialist audiences won't be put off by a registration form - some of their users are literally trainspotters and, at the risk of believing a stereotype, if they can bear the kind of weather that requires anoraks, they're probably not put off by a form.

Or maybe they're trying to control spam (though email addresses are no barrier to spam, and it's easy to use Akismet or moderation to trap spam); or maybe it's a halfway house between letting go and keeping control; or maybe they're tweaking the form in response to usage and will lower the barriers if they're not getting many comments.

Or maybe it's because the user-generated content captured this way goes directly into their collection management system and they want to record some idea of the provenance of the data. From a post to the UK Museums Computer Group list:
We have just launched the London Transport Online Museum. Users can view every object, gallery and label text on display in our new museum in Covent Garden.

Following on from the current discussion thread we have incorporated into this new site, the facility for users to leave us memories / stories on all objects on display. Rather than a Wiki submission these stories are made directly on the website and will be fed back into our collection management system. These submissions can be viewed by all users as soon as they have passed through moderation process.

We will closely monitor how many responses we get and feedback to the group.

Please have a look, and maybe even leave us a memory?
[My emphasis in bold]

Moving on from the example of the London Transport Museum...

Whether the gap between their stated intentions and the apparent barriers to accepting user-generated content is the result of internal ambivalence about or resistance to user-generated content, concern about spam or 'bad data', or a belief that their specialist audiences will persist despite the barriers doesn't really make a difference; ultimately the intentionality matters less than the effect.

By raising the barrier to participation, aren't they ensuring that the casual audience remains exactly that - interested, but not fully engaged?

And as Seb pointed out, "Remembering that even tagging on the PHM collection - 15million views in 2007, 5 thousand tags . . . - and that is without requiring ANY form of login."

It also reminds me of what Peter Samis said at Museums and the Web in Montreal about engaging with museum visitors digitally: "We opened the door to let visitors in... then we left the room".

(If you're curious, the title is a reference to an Australian saying: Clayton's was "the drink you have when you're not having a drink", or as Wikipedia has it, 'a compromise which satisfies no-one'. 'Ersatz' might be another word for it.)

Explaining the semantic web: by analogy and by example

Explaining by analogy: Miko Coffey summarises the semantic web as:
  • Web 1.0 is like buying a can of Campbell's Soup
  • Web 2.0 is like making homemade soup and inviting your soup-loving friends over
  • The semantic web is like having a dinner party, knowing that Tom is allergic to gluten, Sally is away til next Thursday and Bob is vegetarian.
And she's got a great image in the same post to help explain it.

To extend the analogy, it's also as if the semantic web could understand that when your American aunt's soup recipe says 'cilantro', you'd look for 'coriander' in shops in Australia or the UK.

Explaining by doing: this review 'Why I Migrated Over to Twine (And Other Social Services Bit the Dust)' of Twine gives lots of great examples of how semantic web stuff can help us:

So for example when Stanley Kubrick is mentioned in the bookmarklet fields, or in the document you upload, or in the email you send into Twine — the system will analyze and identify him as a person (not as a mere keyword). This is called entity extraction and is applied to all text on Twine.

Under the hood, a person is defined in a larger ontology in relation to other things. Here’s an example of a very small portion of my own graph within Twine:
(Image: Hrafn Th. Thorisson's RDF graph in Twine)


Some may not find the point of this clear. So to explain: Just as HTML enables computers to display data — this extra semantic information markup (RDF, OWL, etc.) enables computers to understand what the data is they’re displaying. And moreover, to understand what things are in relation to other things.

Example Search
For an example, when we search for “Stanley Kubrick” on regular search engines, the words “Stanley” and “Kubrick” are usually regarded as mere keywords: a series of letters that the search engine then tries to find pages with those series of letters. But in the world of semantic web, the engines know “Stanley Kubrick” is a person. This results in a lot less irrelevant items from the search’s results....

If you weren’t already aware, the systems I just described above are the basic semantic web concept: Encapsulating data in a new layer of machine processable information to help us search, find and organize the overwhelming and ever-growing sea of pictures, videos, text and whatever else we’re creating.


I think these are both useful when explaining the benefits of the semantic web to non-geeks and may help overcome some of the fear of the unknown (or fear of investment in the pointless buzzword) we might encounter. If we believe in the semantic web, it's up to us to explain it properly to other people it's going to affect.
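For the geeks, the keyword-versus-entity distinction can be sketched in a few lines of Python. This is only a toy and not how Twine or any real RDF store works: the triples, the identifiers (`person:kubrick`, `herb:coriander`) and both helper functions are made up for illustration.

```python
# Toy "semantic" lookup: facts are (subject, predicate, object) triples.
# All identifiers and data below are invented for illustration.
TRIPLES = [
    ("person:kubrick", "type", "Person"),
    ("person:kubrick", "label", "Stanley Kubrick"),
    ("place:stanley", "type", "Place"),
    ("place:stanley", "label", "Stanley"),
    ("herb:coriander", "type", "Ingredient"),
    ("herb:coriander", "label", "coriander"),
    ("herb:coriander", "label", "cilantro"),  # same thing, different name
]


def keyword_search(text):
    """Naive keyword match: any entity whose label contains any of the words."""
    words = text.lower().split()
    return sorted({s for s, p, o in TRIPLES
                   if p == "label" and any(w in o.lower() for w in words)})


def find_entities(label, entity_type=None):
    """Return subjects whose label matches exactly, optionally filtered by type."""
    wanted = label.lower()
    matches = {s for s, p, o in TRIPLES if p == "label" and o.lower() == wanted}
    if entity_type is not None:
        typed = {s for s, p, o in TRIPLES if p == "type" and o == entity_type}
        matches &= typed
    return sorted(matches)
```

With this data, `keyword_search("Stanley Kubrick")` also drags in the place called Stanley, while `find_entities("Stanley Kubrick", "Person")` returns only the person. And because 'cilantro' and 'coriander' are labels on the same entity, looking up either one finds the same thing.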

I also discovered a good post by Mike on the 'Innovation Manifesto'.

Friday, 18 April 2008

It's a wonderful, wonderful web

First, the news that Google are starting to crawl the deep or invisible web via html forms on a sample of 'high quality' sites (via The Walker Art Center's New Media Initiatives blog):
This experiment is part of Google's broader effort to increase its coverage of the web. In fact, HTML forms have long been thought to be the gateway to large volumes of data beyond the normal scope of search engines. The terms Deep Web, Hidden Web, or Invisible Web have been used collectively to refer to such content that has so far been invisible to search engine users. By crawling using HTML forms (and abiding by robots.txt), we are able to lead search engine users to documents that would otherwise not be easily found in search engines, and provide webmasters and users alike with a better and more comprehensive search experience.
You're probably already well indexed if you have a browsable interface that leads to every single one of your collection records and images and whatever; but if you've got any content that was hidden behind a search form (and I know we have some in older sites), this could give it much greater visibility.
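Since Google say they abide by robots.txt, it might be worth checking what yours allows. Here's a minimal sketch using Python's standard `urllib.robotparser`; the robots.txt rules, the example.org URLs and the `ExampleBot` user agent are all hypothetical, and I'm assuming a search form that submits to `/collections/search`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a collections site: browse pages are open,
# but the search form's results are off-limits to crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /collections/search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawlable(url, user_agent="ExampleBot"):
    """True if the rules above let this (made-up) crawler fetch the URL."""
    return parser.can_fetch(user_agent, url)
```

Under these rules `crawlable("http://example.org/collections/object/123")` is allowed but `crawlable("http://example.org/collections/search?q=tram")` is not, so a well-behaved crawler would leave anything reachable only through that search form alone.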

Secondly, Mike Ellis has done a sterling job synthesising some of the official, backchannel and informal conversations about the semantic web at MW2008 and adding his own perspective on his blog.

Talking about Flickr's 20 gazillion tags:

To take an example: at the individual tag level, the flaws of misspellings and inaccuracies are annoying and troublesome, but at a meta level these inaccuracies are ironed out; flattened by sheer mass: a kind of bell-curve peak of correctness. At the same time, inferences can be drawn from the connections and proximity of tags. If the word “cat” appears consistently - in millions and millions of data items - next to the word “kitten” then the system can start to make some assumptions about the related meaning of those words. Out of the apparent chaos of the folksonomy - the lack of formal vocabulary, the anti-taxonomy - comes a higher-level order. Seb put it the other way round by talking about the “shanty towns” of museum data: “examine order and you see chaos”.

The total “value” of the data, in other words, really is way, way greater than the sum of the parts.

So far, so ace. We've been excited before about using the implicit links people create between data - consciously when they record information with tags, or unconsciously through the paths they take between data - to create those 'small ontologies, loosely joined', and about the possibilities of multilingual tagging, etc. Tags are cool.
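The 'cat appears next to kitten' idea is easy to play with. This is a rough sketch with made-up tag sets and an arbitrary threshold, and certainly not how Flickr does it: count how often pairs of tags turn up on the same item, then keep the pairs that turn up often enough.

```python
from collections import Counter
from itertools import combinations

# Made-up tag sets, one per photo; note the misspelling in the third.
PHOTO_TAGS = [
    {"cat", "kitten", "cute"},
    {"cat", "kitten", "garden"},
    {"cat", "kiten"},
    {"tram", "london"},
    {"cat", "kitten"},
]


def cooccurrence(tag_sets):
    """Count how often each unordered pair of tags appears on the same item."""
    pairs = Counter()
    for tags in tag_sets:
        for a, b in combinations(sorted(tags), 2):
            pairs[(a, b)] += 1
    return pairs


def related(tag, tag_sets, min_count=2):
    """Tags that co-occur with `tag` at least `min_count` times, most frequent first."""
    scores = Counter()
    for (a, b), n in cooccurrence(tag_sets).items():
        if tag == a:
            scores[b] += n
        elif tag == b:
            scores[a] += n
    return [t for t, n in scores.most_common() if n >= min_count]
```

Here `related("cat", PHOTO_TAGS)` comes back with just 'kitten': the one-off misspelling 'kiten' and the incidental tags fall below the threshold, which is the 'flattened by sheer mass' effect in miniature.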

But the applications of this could go further:
I got thinking about how this can all be applied to the Semantic Web. It increasingly strikes me that the distributed nature of the machine processable, API-accessible web carries many similar hallmarks. Each of those distributed systems - the Yahoo! Content Analysis API, the Google postcode lookup, Open Calais - are essentially dumb systems. But hook them together; start to patch the entire thing into a distributed framework, and things take on an entirely different complexion.
...
Here’s what I’m starting to gnaw at: maybe it’s here. Maybe if it quacks like a duck, walks like a duck (as per the recent Becta report by Emma Tonkin at UKOLN) then it really is a duck. Maybe the machine-processable web that we see in mashups, API’s, RSS, microformats - the so-called “lightweight” stuff that I’m forever writing about - maybe that’s all we need. Like the widely accepted notion of scale and we-ness in the social and tagged web, perhaps these dumb synapses when put together are enough to give us the collective intelligence - the Semantic Web - that we have talked and written about for so long.
I'd say those capital letters in 'Semantic Web' might scare some of the hardcore SW crowd, but that's ok, isn't it? Semantics (sorry) aside, we're all working towards the same goal - the machine-processable web.

And in the meantime, if we can put our data out there so others can tag it, and so that we're exposing our internal 'tags' (even if they have fancier names in our collections management systems), we're moving in the right direction.

(Now I've got Black's "Wonderful Life" stuck in my head, doh. Luckily it's the cover version without the cheesy synths).

Right, now I'm off to the Museum in Docklands to talk about MultiMimsy database extractions and repositories. Rock.

Thursday, 17 April 2008

Calling geeks in the UK with an interest in cultural heritage content/audiences

You might be interested in BathCamp - a bar camp in Bath on a Saturday (with overnight stay) in late August. This is an initial open call so head along to the website (BathCamp) and check it out. Ideally you would have an interest in cultural heritage content, audiences or applications, but we love the idea of getting fresh perspectives from a wide range of people so we don't expect that you would have worked with the cultural heritage sector (museums, galleries, libraries, archives, archaeology) before.

Wednesday, 16 April 2008

Questions from 'Beyond Single Repositories' at MW2008

I'm still working on getting my notes from Museums and the Web in Montreal online.

These are notes from the questions at the 'Beyond Single Repositories' session. This session was led by Ross Parry, and included the papers Learning from the People: Traditional Knowledge and Educational Standards by Daniel Elias and James Forrest and The Commons on Flickr: A Primer by George Oates.

This clashed with the User-Generated Content session that I felt I should see for work, but I managed to sneak in at the end of Ross's session. I expected this room to be packed, but it wasn't. I guess the ripples of user-generated content and Web 2.0-ish stuff are still spreading beyond the geeks, and the pebbles of single repositories and the semantic web have barely dropped into the pond for most people. As usual, all mistakes are mine - if you asked a question and I haven't named you or got your question wrong, drop me a line.

Quite a lot of the questions related to 'The Commons'.

There was a question about the difference between users who download and retain context of images, versus those who just download the image and lose all context, attribution, etc. George: Flickr considered putting the metadata into EXIF but it was problematic and wasn't robust enough to be useful.

Another question: how to link back to institution from Flickr? George: 'there's this great invention called the hyperlink'. And links can also go to picture libraries to buy prints.

[I need to check this but it could really help make the case for Commons in museums if that's the case. We might also be able to target different audiences with different requirements - e.g. commercial publications vs school assignments. I also need to check if Flickr URLs are permanent and stable.]

Seb Chan asked: how does business model of having images on Flickr co-exist with existing practices?

Flickr are cool with museums putting in content at different resolutions - it's up to institution to decide.

"It's so easy to do things the correct way" so please teach everyone to use CC licence stuff appropriately.

Issues are starting to be raised about revenue sharing models.

[I wonder if we could put in FOI requests to find out exactly how much revenue UK museums make from selling images compared to the overhead in servicing commercial picture libraries, and whether it varies by type of image or use. It'd be great if we could put some Museum of London/MoLAS images on Commons, particularly if we could use tagging to generate multilingual labels and re-assess images in terms of diversity - such an important issue for our London audiences; or to get more images/objects geo-located. I also wonder if there are any resourcing issues for moderation requirements, or do we just cope with whatever tags are added?]

Update: following the conference, Frankie Roberto started a discussion on the Museums Computer Group list under the subject 'copyright licensing and museums'. You have to be a member to post but a range of perspectives and expertise would really help move this discussion on.

Tuesday, 15 April 2008

Some feedback to MW2008 and other conferences

There's a thread on the Museums and the Web conference site asking for suggestions for MW2009. I was a bit zombie-like by the time I filled out the feedback form, so I've added some more comments.

I'm posting them here because I think they apply to lots of conferences and these are things I'd like to see generally. It might look like a lot of comments but I'm probably inspired to write because overall the conference was so good.

There were suggestions to have Pecha Kucha style sessions for people to talk about their projects. I think that'd be really useful - people in the early stages of a project could get a range of feedback and suggestions from some of the best researchers and most experienced 'doers' around; and the vast majority of projects that will never be written up as big conference papers can still pass on a few valuable lessons in a few minutes. It'd also help build a pool of people who had some experience presenting.

I also suggested having afternoon versions of the Birds of a Feather breakfasts. I'm one of those people who's not at all sociable in the morning, but an afternoon session in a coffee shop or pub would be perfect. It'd also give you a way to meet people and maybe go on to dinner or drinks - it must be really difficult if you don't know anyone there and are a bit shy. I'd imagine you could find people who are interested in the same topics more easily this way because it offers a bit more structure than just drinks.

I don't know if there are any guidelines when writing papers but I'd like to suggest one - it's really useful when people talk about how their projects worked in their institutions/sector, as it helps everyone work out how to champion and implement similar ideas when they get back from the conference. Or maybe that's a thread for one of the museum geeks lists...

It would be really useful if each session listed the audience (managers, technologists, educators, etc) and the level of experience it was aimed at (e.g. absolute beginners, practitioners, people looking for a practical learning session) in the program. A lot of the papers did a really good job covering a range of potential audiences, but I might have skipped other sessions if I'd realised they were aimed at an introductory level.

Museums and the Web conferences are brilliant because they put the papers online, so this is a minor quibble, but it would be handy if the papers were available as pdf (or similar) downloads so I could load them onto my phone or laptop beforehand. That way I could follow them during the presentations if there isn't any network connectivity, or review them afterwards.

Finally, it would be so helpful if all presenters had to put their slides online somewhere, tagged with the conference tag and linked from the conference site. The one paper I've blogged about so far had their slides online, and it helped me immensely when writing up as I could check my notes against theirs. As more people blog about conferences, you might need tags for each session - a bit more overhead, but I'm sure you'd get great conversations between people who blogged about the same sessions and hopefully with presenters too.

How I do documentation: a column of bumph and a column of gold

All programmers hate documentation, right? But I've discovered a way to make it less painful and I'm posting in case it helps anyone else.

The first trick is to start documenting as soon as you start thinking about a project - well before you've written any code. I keep a running document of the work I've done, including the bits I'm about to try, information about links into other databases or applications, issues I need to think about or questions I need to ask someone, rude comments (I know, I look like such a nice girl), references, quick use cases, bits about functions, summary notes from meetings, etc.

Mostly I record by date, blog style. Doing it by date helps me link repository files, paper notes and emails with particular bits of work, which can otherwise be tricky if it's a while since you worked on a project or if you have lots of projects on the go. It's also handy if you need to record the time spent on different projects.

I just did it like this for a while, and it was ok, but I learnt the hard way that it takes a while to sort through it if I needed to send someone else some documentation. Then I made a conscious decision to separate the random musings from the decisions and notes on the productive bits of code.

So now my document has two columns. This first column is all the bumph described above - the stuff I'd need if I wanted to retrace my steps or remind myself why I ended up doing things a certain way. The second column records key decisions or final solutions. This is your column of gold.

This way I can quickly run down the items in the second column, organise it by area instead of by date and come up with some good documentation without much effort. And if I ever want to write up the whole project, I've got a record of the whole process in the column of bumph.

You could add a third column to record outstanding tasks or questions. I tend to mark these up with colour and un-colour them when they're done. It just depends how you like to work.

It's amazingly simple, but it works. I hope it might be useful for you too. Or if you have any better suggestions (or a better title for this post), I'd love to hear them.

Saturday, 12 April 2008

What Does Openness Mean to The Museum Community?

There's an almost-live report from Mike Ellis and Brian Kelly's "What Does Openness Mean to The Museum Community?" forum at the Museums and the Web conference yesterday at http://mw2008.wetpaint.com/page/report

It's a really important discussion and as it's a wiki I assume you can add comments. I am running late for a session but will sort out my notes later.

Friday, 11 April 2008

Notes from Advanced Web Development: software strategies for online applications at MW2008

These are my notes from the Advanced Web Development: software strategies for online applications workshop with Rob Stein, Charles Moad and Edward Bachta from the Indianapolis Museum of Art at Museums and the Web 2008 (MW2008) in Montreal. I don't know if they'll be useful for anyone else, but if you have any questions about my notes, let me know.

They had their slides online before the presentation, which was really helpful. [More of this sort of thing, please! Though I wish there was a way to view thumbnails of slides on slideshare so you can skip to particular slides.]

The workshop covered a lot of ground, and they did a pretty good job of pitching it at different levels of geekdom. Some of my notes will seem self-evident to different types of geeks or non-geeks but I've tried to include most of what they covered. I've put some of my own comments in [square brackets].

They started with the difference between web pages and web applications, and pointed out that people have been building applications for 30 years so build on existing stuff.

Last year's talk was about 'web 2.0' and the foundations of building solid software applications but since then APIs/SDKs have taken off. Developers should pick pieces that already work rather than building from the bottom up. The craft lies in knowing how to choose the components and how to integrate them.

There are still reasons to consider building your own APIs e.g. if you have unique information others are unlikely to support adequately, if you care about security of data, if you want to control the distribution of information, or if a guarantee of service is important (e.g. if vendors disappear).

Building APIs
They're using model-driven development, with an XML Schema or a database as the model.

Object relational mappers provide object-oriented access to a database. Data model changes are picked up automatically and they're generally database-agnostic so you can swap out the back end. Object relational mappers include ActiveRecord (Ruby on Rails), Hibernate (NHibernate in .Net), Propel and SQLAlchemy.
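[To show what 'data model changes are picked up automatically' means, here's a toy sketch using Python's built-in `sqlite3`. It isn't Hibernate, Propel or SQLAlchemy, and real ORMs do far more (relationships, saving, query building); the `artwork` table and the `fetch_objects` helper are my own invention.]

```python
import sqlite3
from types import SimpleNamespace


def fetch_objects(conn, table):
    """Map every row of `table` to an object with one attribute per column.

    Column names are read from the database at call time, so adding a
    column changes the objects without touching this code.
    """
    # Table names can't be bound as parameters; only pass trusted names here.
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cursor.description]
    return [SimpleNamespace(**dict(zip(columns, row))) for row in cursor]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artwork (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO artwork (title) VALUES ('Tram poster')")

before = fetch_objects(conn, "artwork")

# The data model changes: a new column appears...
conn.execute("ALTER TABLE artwork ADD COLUMN maker TEXT")
conn.execute("UPDATE artwork SET maker = 'Unknown'")

# ...and the mapped objects pick it up with no code changes.
after = fetch_objects(conn, "artwork")
```

[The objects in `before` have `id` and `title`; the ones in `after` also have `maker`, even though the mapping code never mentions it.]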

IMA use Hibernate with EMu (their collections management system) and Propel. They've built an 'adaptive layer' for their collection that glues it all together.

Slide on Eclipse: 'rich client platform', not just an IDE. Supports nearly every language except .Net; is cross-platform.

Search
Use full-text indexes for good search functionality. They suggest Lucene (from apache.org) or Google gears. Lucene query types offer finer control than Google e.g. fielded searching [a huge draw for specialist collections searches], date range searching, sorting by any field, multiple index searching with merged results. Fast, low memory usage, extensible. Tools built on Lucene include Nutch (web crawler) and Solr - REST and SOAP API.
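[Fielded searching is the bit that matters most for collections, so here's a toy illustration in plain Python. It has nothing to do with Lucene's actual API or query syntax; the records, field names and both functions are invented. It only shows the idea of indexing terms per field, then filtering by date range and sorting by any field.]

```python
from datetime import date

# Invented catalogue records standing in for indexed documents.
RECORDS = [
    {"id": 1, "title": "Horse tram", "maker": "Unknown", "made": date(1882, 1, 1)},
    {"id": 2, "title": "Tram poster", "maker": "Horse & Co", "made": date(1925, 6, 1)},
    {"id": 3, "title": "Bus ticket", "maker": "Unknown", "made": date(1950, 3, 15)},
]


def build_index(records, fields):
    """Inverted index keyed by (field, term) -> set of record ids."""
    index = {}
    for record in records:
        for field in fields:
            for term in record[field].lower().split():
                index.setdefault((field, term), set()).add(record["id"])
    return index


def search(index, records, field, term, made_from=None, made_to=None, sort_by="id"):
    """Fielded term search with an optional date range, sorted by any field."""
    ids = index.get((field, term.lower()), set())
    hits = [r for r in records if r["id"] in ids]
    if made_from is not None:
        hits = [r for r in hits if r["made"] >= made_from]
    if made_to is not None:
        hits = [r for r in hits if r["made"] <= made_to]
    return sorted(hits, key=lambda r: r[sort_by])


INDEX = build_index(RECORDS, fields=("title", "maker"))
```

[Searching the `title` field for 'horse' finds only the horse tram, while the same word in the `maker` field finds only the poster by 'Horse & Co' - exactly the kind of distinction a single keyword box can't make.]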

Bite size web components and suggestions for a web application toolkit
Harking back to the 'find good components' thing. Leverage someone else's work, and reduce dev/debugging costs - in their experience it produces fewer errors than writing their own stuff.

Storage - Amazon, Nirvanix, XDrive, Google, Box.net. Use Amazon S3 if accessed infrequently because of its fee structure.

Video - YouTube, Revver, blip.tv also have developer interfaces. The IMA don't host any video on their website, it's all on YouTube.

Images - Flickr, Picasa. [But the picasa UI sucks so please don't inflict that on your users!]. Flickr support for REST, SOAP, JSON.

Compute (EC2, Amazon Web Services) - Linux virtual machines. Custom disk images for specific requirements. Billable on use. See slides on costs for web hosting.

Authentication services - OpenID, OAuth.

Social computing
Consider social computing when developing your web applications - it's evolving rapidly and is uncertain. Facebook vs OpenSocial (might be the question today, but tomorrow?). Stick with the eyeballs and be ready to change. [Though the problem for museums thinking about social software applications remains - by the time most museums go through approval processes to get onto Facebook it'll be dead in the water. Another reason to have good programmers on staff and include content resources in online programs, so that teams can be more flexible while still working within the overall online strategy of their organisation.]

Developing on Facebook
Facebook API - REST-based API. Use their developer platform - simpler than original API calls. JSON simpler than XML responses. Facebook Query Language (FQL) reduces calls to API. Facebook Markup Language (FBML): HTML + Facebook-specific features, inc security controls and interface features. [There's a pronoun tag with built-in 'they' if not sure of gender of person. Cute.] Lots more in their slides.
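[On 'JSON simpler than XML responses': a quick sketch of why, using only Python's standard library. The two payloads are invented and are not Facebook's real response format; they just carry the same data both ways.]

```python
import json
import xml.etree.ElementTree as ET

# Two invented payloads carrying the same data; neither is Facebook's real format.
JSON_RESPONSE = '{"user": {"name": "Ada", "friends": 42}}'
XML_RESPONSE = "<user><name>Ada</name><friends>42</friends></user>"

# JSON maps straight onto dicts, strings and numbers.
from_json = json.loads(JSON_RESPONSE)["user"]

# XML needs walking, and every value arrives as text to convert by hand.
root = ET.fromstring(XML_RESPONSE)
from_xml = {"name": root.findtext("name"), "friends": int(root.findtext("friends"))}
```

[Both end up as the same dictionary, but the XML route needs you to know which elements to look for and which ones are numbers.]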

Widget frameworks
Widgets are the buzzword that hasn't quite taken off. The utility isn't quite there yet, so what are they used for? Players are Google, Netvibes (supports more platforms including Apple Mac dashboard, Yahoo, iGoogle, etc) but is Adobe AIR the widget killer? Flash-based runtime for desktop apps. e.g. twhirl. Run as background processes, and can access desktop files directly, clipboard, drag and drop. [I downloaded the AIR Google Analytics application during the session, it's a good example.]

Content management
The CMS is the container to put all the components together. A good CMS will let you integrate components into a new site with a minimum of effort. [Wouldn't that be nice?] Examples include Joomla, WordPress, Drupal, Plone.

There aren't slides for the next 'CMS tour' bit, but they gave some great examples.

Nature holds my camera: they tried visitor blogging with a terminal in gallery so people could ask questions.

They talked about the IMA dashboard. [I asked a vague question about whether there was a user-driven or organisational business case for it - turns out it was driven by their CEO's interest in transparency, e.g. in sharing how they invest monies, track stats and communicate with their visitors. It helps engender trust and loyalty e.g. for donors. Attendance drives corporate sponsorship so there was a business case. It's also good for tracking their performance against actual actions vs stated goals.]

The advantages of using a web application toolkit - theromansarecoming.com took $50,000 to build for a four month exhibition. It hit the goals but was expensive. [The demo looked really cool, it's a shame you don't seem to be able to access it online.]

Breaking the Mode was built using existing components on the technical side, but required the same content investment i.e. in-house resources as The Romans Are Coming. The communication issues were much better because it was built in-house - less of a requirement to explain to external developers, which had some effect on the cost [but the biggest saving was in re-usable components] - the site took 25 hours to build and IT staff costs were about $1000. [So, quite a saving there.]

They demonstrated 'athena', the IMA's intranet. It has file sharing and task management and is built on Drupal; it looks a bit like Basecamp-lite but without licensing issues. "Everything you do in a museum is project-based" and their intranet is built to support that.

There was discussion about whether their intranet could be shared with other museums. Rob Stein is a firm believer in open source and thinks it's the best way to go for museum sector. They're willing to share the source code but don't have the facilities to support it. There's a possibility that they could partner with other institutions to combine to pay small vendors to support it.

[I could hear a sudden burst of keyboards clicking around me as the discussion went onto pooling resources to create and support open source applications for stuff museums need to do. Smaller museums (i.e. most of us, and most are much smaller than MoL) don't have the resources for bespoke software or support but if we all combined, we'd be a bigger market. Overall, it was a really good, grounded discussion about the realities and possibilities of open source development.]

Back to the slides...

Team Troubles
[It was absolutely brilliant to see a discussion of teamwork and collaboration issues in a technical session.]

Divide and conquer - allow team members to focus on area of expertise. Makes it easier to swap out content and themes.

They're using MVC - Model (data management), Controller (interaction logic), View (user interface). They had some good stuff on MVC and the web in their slides (around 77-79). They also discussed the role of non-technical team members.
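[If MVC is new to you, here's a minimal sketch of the split in Python. It's my own illustration rather than anything from the IMA's slides, and the class and function names are made up. The point is that the view can be swapped without touching the model, which is what makes it easier to change content and themes independently.]

```python
from html import escape


class ArtworkModel:
    """Model: owns the data and nothing else."""

    def __init__(self):
        self._titles = {1: "Tram poster"}

    def get_title(self, artwork_id):
        return self._titles.get(artwork_id)


def html_view(title):
    """View: turns data into markup; knows nothing about where it came from."""
    return f"<h1>{escape(title)}</h1>"


def text_view(title):
    """A second view over the same data, e.g. for a plain-text feed."""
    return title.upper()


class ArtworkController:
    """Controller: handles a request by asking the model and picking a view."""

    def __init__(self, model, view):
        self.model = model
        self.view = view

    def show(self, artwork_id):
        title = self.model.get_title(artwork_id)
        if title is None:
            return "Not found"
        return self.view(title)
```

[So `ArtworkController(ArtworkModel(), html_view).show(1)` gives you a heading, and passing `text_view` instead gives you the same record as plain text, with no change to the model or controller.]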

Drupal boot camp
[This was a pretty convincing demo of getting started with Drupal and using the Content Construction Kit (CCK) to create custom content types e.g. work of art to publish content quickly, though I did wonder about how it integrated with ORMs that would automatically pick up an underlying data structure. Slide 103 showed recommended Drupal modules. It's definitely worth checking out if you're looking for a CMS. If you're on Windows, check out bitnami for installation.]

Client side development
"The customer is always right"
They talked about the DOM (document object model) and javascript for Web 2.0 coolness.
They recommended using Javascript toolkits - more object-orientated, solve cross-browser issues, rapid development. Slide 109 listed some Javascript toolkits and they also recommended Firebug.

Interface components
They should be re-usable, just like the server-side stuff. They showed some suggestions like reCAPTCHA, image carousels and rating modules. Pick the tools with best community support and cross-platform support.

CSS boilerplates
Treat CSS like another software component of web design and standardise your CSS usage. Use structured naming for classes and divs in server-side content generation. Check out oswd.org for free templates.

XML in the real world
They demonstrated Global Origins (more on that and other goodness at www.ima-digital.org/special-projects) which uses XML driven content.

Questions and discussion
I asked about integration with legacy/existing systems. Their middleware component 'Mercury' binds their commercial packages and other applications together. e.g. collection management system extraction layer. [This could be a good formalised model for MoL, as we have to pull from a few different places and push out to lots more and it's all a bit ad hoc at the moment. I think we'll be having lots of good discussions about this very soon.]

Some discussion about putting pressure on vendors to open data models. It's a better economic model for them and for museums.

Their CEO is supportive of iteration (in the development process). The web team is cross-department, and they have new media content creators.

[I was curious about how iterative development and the possibility of making mistakes work with their brand but didn't want to ask too many questions]

They made the point that you have a bigger recruiting pool with open source software. [Recruiting geeks into museums has been a bit of a conference meme.]

They give away iPods for online surveys and get more responses that way, but you do have to be aware that people might only give polite answers to survey questions so pay close attention to any criticism.

The IMA say you should be able to justify the longevity of projects when experimenting. Measure your projects against your mission, and how they can implement your mission statement.

So, that's it! I hope I didn't misrepresent anything they said.