Amplified: Michael Joyce on the Amplify Podcast Preservation Tool
Amplified is a new blog series taking us behind the scenes at the Amplify Podcast Network to explore the different ways our team is reimagining the sound of scholarship. We are kicking off the series this month by learning more about how we’re putting our dreams of podcast preservation into practice. Amplify project manager Stacey Copeland is joined by Michael Joyce, our resident web and data services developer from the DHIL @ Simon Fraser University, to talk about the ongoing development of our open source podcast preservation tool, which will allow podcasters to archive their episodes with rich metadata and to preserve their podcasts in their institutional repositories.
Stacey Copeland: Here at Amplify we’re dedicated to reimagining the sound of scholarship. As with all digital projects, podcasts need to be archived properly so that they don’t turn into digital dinosaurs in two or five or ten years when you stop updating the site or paying for your hosting service. We don’t just want to produce new, peer-reviewed podcasts; we also want to make sure those podcasts are discoverable in the short term and the long term. Our goal is to make peer-reviewed podcasts as easy to find as journal articles. That’s why we’ve partnered with the Digital Humanities Innovation Lab at SFU, as well as WLU Press and WLU Library, to build a new tool that will allow podcasters to archive their episodes with rich metadata, and to preserve their podcasts in their institutional repositories. Hi, I’m Stacey Copeland, project manager of the Amplify Podcast Network, and to learn more about how we’re putting our dreams of podcast preservation into practice, I’m joined by Michael Joyce, our resident web and data services developer on the project.
Michael Joyce: My name is Michael Joyce. I'm the web and data services developer for the Digital Humanities Innovation Lab at Simon Fraser University. My work is mostly in databases and data mining. I also do a little bit of development work for front-end things, mostly through creating APIs, the systems that let programs talk to other programs. My relationship with Amplify is interesting. I'm very happy and proud to be a part of it. It's a great project and Hannah [Hannah McGregor] is a lot of fun to work with. I built the database that collects all of the metadata for the podcast episodes, and it will also create the export packages that get loaded into the digital preservation systems.
Stacey Copeland: Michael, can you tell me a bit more about the preservation tool, from the big-picture context of what you hope it will do, and what we're hoping as a project it will do, to more of the nitty-gritty of designing its operation for users like podcasters, archivists and librarians?
Michael Joyce: Sure. We're in the SFU Library, and we use Islandora for our digital preservation system. It's this large, complex software package that collects data and ensures its integrity over time. It's constantly doing things like verifying checksums on the data and the metadata in the repository.
Michael Joyce: Checksums are a way of verifying the integrity of the content of a file. So it's a mathematical tool that looks at different data blocks and makes sure that they have the right properties.
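Editor's note: as a rough illustration of the checksum idea Michael describes, here is a minimal Python sketch. The choice of SHA-256 and the function names `checksum` and `is_intact` are assumptions made for this example; the interview does not say which algorithm or code Islandora uses for its integrity checks.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, recorded: str) -> bool:
    """Compare a freshly computed checksum with the one recorded earlier."""
    return checksum(data) == recorded


# Record a checksum when the (stand-in) episode file is first stored.
original = b"stand-in bytes for an episode audio file"
recorded = checksum(original)

# Later, recompute and compare: any change to the bytes changes the digest.
damaged = original + b"!"
print(is_intact(original, recorded))
print(is_intact(damaged, recorded))
```

The first check passes because the bytes are unchanged; the second fails because even one added byte produces a different digest.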
Stacey Copeland: What I didn’t ask Michael to explain was metadata, but we can understand that here as “data that provides information about other data”. In the case of podcasts, it’s typically data like the name of the show, the contributors and other information that helps describe the accompanying audio file.
Okay, so we know Islandora is free, open-source software that allows institutions and their audiences to manage and discover digital assets. It’s working behind the scenes at libraries and other institutions, like the library here at Simon Fraser University, to preserve digital materials for library users to access, whether that’s images, articles, reports, or in our case here at Amplify, podcasts. But how do we compile all our podcast material and metadata in a way that’s compatible with Islandora? That’s where our Amplify Podcast Preservation Tool comes in.
Michael Joyce: The Amplify database project is written in PHP [a programming language] using the Symfony framework. It's a series of software packages that all work together to provide things like user accounts, database connections and queries, and a whole bunch of other really useful things.
It provides editing tools and front-end forms for people to upload podcast episodes and provide all of the metadata that we can collect about each episode. It also allows people to upload images to associate with each episode, and transcriptions in PDF. So it stores that sort of temporarily until an entire season is uploaded and described, and then we can export the entire season, all as a single package, to transfer that to Islandora, where it is then preserved.
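Editor's note: to make the "whole season as a single package" idea concrete, here is a minimal Python sketch that bundles a season's files into one ZIP archive with a manifest. The tool itself is written in PHP, and the interview does not describe its real package format, so the folder layout, the `manifest.json` file and the function name `build_season_package` are all hypothetical.

```python
import io
import json
import zipfile


def build_season_package(season_title, episodes):
    """Bundle every episode of a season into one ZIP archive (as bytes).

    Each episode is a dict with 'title' and 'audio' (bytes), plus an
    optional 'transcript' (bytes). This layout is illustrative only.
    """
    buffer = io.BytesIO()
    manifest = {"season": season_title, "episodes": []}
    with zipfile.ZipFile(buffer, "w") as package:
        for number, episode in enumerate(episodes, start=1):
            folder = f"episode-{number:02d}"
            files = ["audio.mp3"]
            package.writestr(f"{folder}/audio.mp3", episode["audio"])
            if "transcript" in episode:
                files.append("transcript.pdf")
                package.writestr(f"{folder}/transcript.pdf", episode["transcript"])
            manifest["episodes"].append(
                {"title": episode["title"], "folder": folder, "files": files}
            )
        # The manifest records what the package contains.
        package.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buffer.getvalue()


season = [
    {"title": "Pilot", "audio": b"fake audio", "transcript": b"fake pdf"},
    {"title": "Second episode", "audio": b"more fake audio"},
]
package_bytes = build_season_package("Season 1", season)
names = zipfile.ZipFile(io.BytesIO(package_bytes)).namelist()
print(names)
```

Holding the files until the season is complete and then writing one archive mirrors the workflow described above: nothing is exported until everything has been uploaded and described.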
Stacey Copeland: Having this bigger picture of the preservation tool specifically, what were some of the key questions, like research questions, that you and the team behind the preservation tool were trying to answer?
Michael Joyce: We looked a lot at which metadata for each episode we needed to record and preserve, and then we had to compare that with what Islandora is capable of storing and preserving. And so transforming one into the other was a bit of a fun challenge.
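Editor's note: the transformation Michael mentions can be pictured as a field-by-field mapping from what the podcast database records to what the repository can store. The sketch below is a Python illustration only. The podcast field names, the Dublin Core-style target names and the function `to_repository_record` are assumptions; the interview does not say which metadata schema the SFU repository uses.

```python
# Hypothetical mapping from podcast fields to repository fields.
FIELD_MAP = {
    "episode_title": "title",
    "contributors": "creator",
    "published": "date",
    "summary": "description",
}


def to_repository_record(episode):
    """Split episode metadata into mapped fields and leftovers.

    Returns (record, unmapped): 'record' uses the repository's field
    names, and 'unmapped' holds anything with no agreed target yet.
    """
    record = {}
    unmapped = {}
    for field, value in episode.items():
        target = FIELD_MAP.get(field)
        if target is None:
            unmapped[field] = value
        else:
            record[target] = value
    return record, unmapped


episode = {
    "episode_title": "Pilot",
    "contributors": ["Host A", "Guest B"],
    "published": "2021-01-01",
    "explicit": False,
}
record, unmapped = to_repository_record(episode)
print(record)
print(unmapped)
```

Returning the unmapped fields separately, rather than silently dropping them, makes it visible where the two sets of metadata do not line up.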
Stacey Copeland: And on challenges, what were some of the key challenges that you've hit with the creation of the tool?
Michael Joyce: Our Islandora repository has, I think, nine million items in it at Simon Fraser University. Typically, the things that are preserved in Islandora are much simpler. Think of it like a single postcard. You have the front and back and a little bit of metadata. You don't have the audio. You don't also have a transcript. You don't have a whole bunch of additional things, right? But for a podcast episode, we have the audio file, we have one or more image files to go with it, and a PDF transcription. That means it's really three objects sort of masquerading as a single object. These compound objects are complicated, and storing them and presenting them to the user is also complicated. It's not something we do much in our repository.
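Editor's note: one way to picture a compound object is as a parent record with an ordered list of child files. The Python sketch below is a simplified illustration, not Islandora's actual data model; the class names `ChildObject` and `CompoundObject` and the `next_child` method are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChildObject:
    kind: str       # for example "audio", "image" or "transcript"
    filename: str


@dataclass
class CompoundObject:
    title: str
    children: List[ChildObject] = field(default_factory=list)

    def next_child(self, index: int) -> int:
        """Return the position after 'index', wrapping back to the start."""
        return (index + 1) % len(self.children)


episode = CompoundObject(
    title="Pilot",
    children=[
        ChildObject("audio", "pilot.mp3"),
        ChildObject("image", "pilot-cover.jpg"),
        ChildObject("transcript", "pilot.pdf"),
    ],
)

# Step from the audio to the image, to the transcript, and back to the audio.
position = 0
visited = []
for _ in range(4):
    visited.append(episode.children[position].kind)
    position = episode.next_child(position)
print(visited)
```

A postcard would be a single object with one file; the episode here is one title that a user sees, backed by several files that each need to be stored and displayed.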
Stacey Copeland: And so with this preservation tool, do you see this kind of project being able to encourage other people to use Islandora in these more complicated, multifaceted ways?
Michael Joyce: I certainly hope so.
Stacey Copeland: And so, with that, what's next for the project at this point? I know you've been working really hard on getting it to an operational point. So at this point, would it be mostly design, or are there still a few hiccups to work out?
Michael Joyce: There are still a few hiccups to work out. We're having a few issues with the display of the metadata in Islandora and also with the way that these compound objects are shown and navigated. Moving from the audio file to the image to the transcript and back to the audio file, for example, is maybe a little bit more complicated than it needs to be. We're also going to be adding each episode to the SFU Library catalogue to make them more discoverable. And so figuring out what the proper process is to do that is going to take a little bit of work with some of the cataloguing specialists in the library.
Stacey Copeland: And so what do you see happening? Maybe what do you hope is going to happen with this project once it does get to a point where you can share it with other libraries or universities or researchers?
Michael Joyce: Well, I hope that there are other universities that end up using this project. It’s open source, so other people can install it in their institutions and they can make whatever changes they deem necessary for their environment. One thing that I would really like to see is support for other preservation platforms like CONTENTdm or Hydra. The SFU Library is all about Islandora. These are similar platforms, but they have different requirements for the packages that they import and preserve.
Stacey Copeland: Great. So very concise. [laughs] So, it is good to hear a bit more about your aspect of the project.
Michael Joyce: Thanks. I appreciate that.
Stacey Copeland: Whether you’re a podcast academic or listener, we hope the importance of podcast preservation is an idea that sticks with you. Michael and our team members here at Amplify are working diligently toward a fully functioning open source podcast preservation tool for any library or institution to use, but don’t wait! If you’ve got a podcast, now is the time to start organizing and backing up your files. Just think of it as doing a favour for your future self! It’ll be much easier to preserve your podcast in the long term via an institutional or disciplinary repository if you’re already in the habit of keeping your files consistently named and well-organized. Stay tuned for more episodes of Amplified, behind-the-scenes chats coming to you soon from our team here at Amplify Podcast Network.
Intro + Outro Music: Pxl Cray – Blue Dot Studios (2016)
Written and produced by: Stacey Copeland
Guest: Michael Joyce, DHIL @ SFU Web and data services developer