Media Supply Chain and Metadata
This series of articles looks at the What, Where, and How of the Media Supply Chain, delving into the fundamentals of what exactly the media supply chain is and how to go about building one.
Our first article focused on standardization, a topic not unrelated to the one we'll explore in this article: metadata.
In fact, in our eBook on the same topic, we mentioned metadata quite a lot - thirty times, to be precise - but didn't spend much time discussing exactly what that metadata is, why it's important, how to create and manage it, and how to structure it in a media supply chain context.
Before we dive in: there is a good reason that metadata was mentioned so frequently in our eBook. Metadata describes what your media assets are, and that description is the key to realizing the value of those assets - you've got to know what you've got before you can work out what to do with it.
Media metadata broadly falls into three categories:
Descriptive metadata is any information that describes all or part of an asset and can be used later for identification and discovery. It is the best-known type of metadata and is often described as the most robust, simply because there are so many ways to describe an asset.
Structural metadata tells us how the asset is formed – what audio, video, and ancillary data tracks it is made up of, any chapters/segments that are within those tracks, and whether it’s related to other assets (i.e. is part of a series or set).
Administrative metadata concerns the technical source of a digital resource and how it can be managed. It is the metadata that relates to rights and intellectual property, providing information about the owner as well as where and how the asset is allowed to be used.
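The three categories above can be sketched as a single record per asset. This is a minimal illustration only - the field names and grouping are assumptions for this article, not a standard schema:

```python
# Illustrative only: one way to group the three metadata categories
# for a single asset. All field names are hypothetical.
asset_metadata = {
    "descriptive": {          # identification and discovery
        "title": "Evening News 2024-03-01",
        "keywords": ["news", "politics"],
        "synopsis": "Nightly news bulletin.",
    },
    "structural": {           # how the asset is formed
        "video_tracks": 1,
        "audio_tracks": 2,
        "segments": [{"start": "00:00:00", "end": "00:12:30"}],
        "part_of_series": "Evening News",
    },
    "administrative": {       # rights, ownership, permitted usage
        "owner": "Example Broadcaster",
        "territories": ["DE", "UK"],
        "license_expires": "2026-12-31",
    },
}

# A discovery query would typically read only the descriptive block:
print(asset_metadata["descriptive"]["keywords"])
```

In practice each block is served by different tools: search and recommendation read the descriptive block, players and transcoders the structural one, and rights systems the administrative one.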
In this article we will mostly look at structural and descriptive metadata. We’ll go into a little more detail regarding administrative metadata, specifically rights and intellectual property management, in the next article.
While huge variability still exists between departments, facilities, and geographies, the metadata landscape today is a million miles from the “Wild West” of 15-20 years ago. Much of this standardization has come about indirectly, through developments and standardization efforts in other parts of the media lifecycle. For example, the consolidation of file formats used in media workflows — as discussed in the previous article, primarily now to MXF, and further constrained by the application specifications also mentioned — has resulted in a standard taxonomy for structural metadata. Similarly, as the need has grown to transmit certain metadata, such as captioning, alongside video and audio, there have been further efforts to standardize this and other ancillary data.
Driven by the need to assure quality when exchanging files, and by the increased use of automated media analysis tools, QC (Quality Control) was another area that saw significant standardization of descriptive metadata - again, largely thanks to organizations such as the DPP (Digital Production Partnership), the IRT (Institut für Rundfunktechnik), and the EBU (European Broadcasting Union) QC project.
It’s not easy to remember - or, in some cases, imagine - a time when all the metadata we had about an asset depended solely on a human accurately capturing and documenting it, even the most basic structural data such as duration, which we now have access to almost instantly and without thought. Automated media analysis has come a very long way in a relatively short period of time: even the basic, free, and almost ubiquitous MediaInfo is only just 18 years old. MediaInfo gave us access to the structural metadata of many media files, as well as some descriptive fields, but less than three years later UK start-up Vqual was the first to hit the broadcast market with a file-based video and audio analyzer, Cerify. The biggest strength, but also a weakness, of Cerify and similar tools at the time was that they could (and often did, if configured incorrectly) produce data about every frame, macroblock, and pixel in a file. Making use of - or even making sense of - this much data often presented as much of a problem as automated QC solved, but as media analysis tools have evolved, that data has become increasingly useful. Artificial intelligence (AI) and machine learning (ML) have brought further innovation and even more data from video and audio analysis, yet again presenting challenges in managing and, crucially, capitalizing on this data.
The role of metadata in workflows has developed far beyond driving basic search and automation. Today, run-time decisions are made during automated workflows based on either existing metadata or metadata generated or updated during the workflow itself.
These decisions can be quite simple. For example, if media analysis shows that an ingested asset is not in a “house format”, a transcode step can be dynamically added to the ingest workflow. Or, as a more advanced example, processing can be “fast-tracked” if the contextual analysis of a transcript (perhaps from a separate speech-to-text step) matches trending keywords.
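Both decisions above can be sketched as metadata-driven branching in a workflow planner. This is a hedged illustration: the house-format values, the shape of the analysis results, and the trending keyword set are all assumptions, not any real product's API:

```python
# Sketch of run-time workflow decisions driven by metadata.
# HOUSE_FORMAT, TRENDING, and the analysis dict shape are hypothetical.
HOUSE_FORMAT = {"container": "MXF", "video_codec": "AVC-Intra"}
TRENDING = {"election", "championship"}

def plan_ingest_steps(analysis: dict) -> list[str]:
    """Build an ingest workflow dynamically from media-analysis metadata."""
    steps = ["ingest", "qc"]
    # Decision 1: asset not in house format -> add a transcode step.
    if (analysis["container"] != HOUSE_FORMAT["container"]
            or analysis["video_codec"] != HOUSE_FORMAT["video_codec"]):
        steps.append("transcode")
    # Decision 2: transcript context matches trending keywords -> fast track.
    if TRENDING & set(analysis.get("transcript_keywords", [])):
        steps.insert(0, "fast_track")
    steps.append("publish")
    return steps

print(plan_ingest_steps({
    "container": "MP4", "video_codec": "H.264",
    "transcript_keywords": ["election", "debate"],
}))
# ['fast_track', 'ingest', 'qc', 'transcode', 'publish']
```

The point is that the workflow's shape is not fixed in advance; it is assembled at run time from whatever the analysis metadata says about the asset.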
However, each of the advancements above has brought its own challenges.
From a standardization perspective, this has, of course, been largely positive - increasing interoperability and minimizing data transforms. However, there will never be a “one size fits all” metadata schema that works for every media organization. Indeed, in many cases there is no “one size fits all” schema even for the different departments within a single organization - the needs of an editor and an archivist, for example, are quite different. To address this challenge, we have to consider the data model we use to store metadata, so that all of it can be stored together, yet easily accessed and presented to meet the needs of different users.
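One way to meet that requirement is a single store of namespaced fields with role-specific views over it. The namespaces and roles below are illustrative assumptions, not a recommended standard:

```python
# Sketch: all metadata lives together in one store, keyed by namespace;
# each role sees only the namespaces it needs. Names are hypothetical.
metadata = {
    "technical.duration": "00:28:40",
    "technical.codec": "AVC-Intra",
    "editorial.synopsis": "Documentary on coastal erosion.",
    "rights.owner": "Example Broadcaster",
    "archive.retention_class": "permanent",
}

ROLE_NAMESPACES = {
    "editor": ("technical.", "editorial."),
    "archivist": ("technical.", "rights.", "archive."),
}

def view_for(role: str) -> dict:
    """Filter the shared store down to the namespaces a role needs."""
    prefixes = ROLE_NAMESPACES[role]
    return {k: v for k, v in metadata.items() if k.startswith(prefixes)}

print(view_for("editor"))   # technical + editorial fields only
```

The design choice here is that nothing is duplicated per department: the editor and the archivist read the same underlying record, just through different filters.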
One element of automated media analysis that dominated discussions when those first QC tools were introduced was the notion of “false positives”. Even more so now, as AI-based analyzers become commonplace in media workflows, we’ve moved from having a relatively small amount of metadata that we trusted (at least as much as the humans who generated it) to having huge amounts of metadata with varying degrees of confidence in its accuracy. Confidence, and confidence thresholds, now play an important role in our workflows - potentially with different thresholds in different workflows or different parts of the organization.
The Metadata Multiplex
Raw metadata may be useful, but we often need to further process or combine metadata to generate real value. For example, a transcript generated by a speech-to-text process provides searchable terms, but only if its text happens to match a typical or related search term. By processing that metadata further and performing a contextual analysis, we can generate context or tags that let us automate the categorization of the media and improve search. Thinking back to the previous topic of trust, we may also be able to combine metadata to increase confidence levels. For example, if a face-detection process identified “Ms Merkel” with only 80% confidence, but the transcript and context analysis identify “German politics”, perhaps we can raise the overall confidence level and place more trust in that metadata.
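The “Ms Merkel” example can be illustrated with a toy combination rule. To be clear, the fixed boost and the numbers below are assumptions for illustration, not a statistical model of how real analyzers combine evidence:

```python
# Toy illustration: raise a detector's confidence when an independent
# signal (context analysis) supports it. The boost rule is hypothetical.
def combined_confidence(face_conf: float, context_tags: set[str],
                        supporting_tags: set[str], boost: float = 0.1) -> float:
    """Boost face-detection confidence if context analysis agrees."""
    if context_tags & supporting_tags:
        return min(1.0, round(face_conf + boost, 2))
    return face_conf

# Face detection says "Ms Merkel" at 0.80; context analysis found
# "german politics", which supports the identification.
conf = combined_confidence(0.80, {"german politics", "europe"},
                           supporting_tags={"german politics"})
print(conf)  # 0.9
```

A workflow could then compare the combined value against its own confidence threshold, rather than the raw detector score, before acting on the metadata.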
When architecting our own plans for integrating cognitive services a few years ago, our colleague Ralf Jansen stated an ambition of “knowing every detail about every frame”. However, there’s only value in understanding and/or documenting a detail if it adds value to the content, or if it saves cost in the production process. So, what sort of detail are we referring to?
1. “Internal” Details
Automated media analysis has already come a long way and will no doubt continue to develop. Knowing who and what appears in each frame, and in what context, is a great enabler in production environments, perhaps even automating some production processes such as highlight creation. Identifying sellable products in a scene may help target specific advertisers, boosting revenues, or that data may even be passed downstream to target viewers directly, enabling new revenue streams.
2. “External” Details
As the media supply chain becomes increasingly connected, as well as internal details and metadata flowing outward and downstream, we will undoubtedly see data flow upstream too. For assets that have already “aired” we may see viewer data (ratings and demographics), or even predicted data for assets that haven’t aired yet - again informing production decisions, and certainly enabling organizations to extract maximum value from their media catalogues and archives. With new delivery platforms, we may see the equivalent of viral coefficients or content scores applied to assets, or parts of assets, again enabling production and distribution decisions.
The Human Touch
Increasingly, the metadata created and used in media workflows is being generated by machines rather than humans - perhaps even to the extent that the task of “logging” may soon be a thing of the past. However, the human touch is still very much required, and absolutely adds value, in the data “chain” of the future: firstly, in knowing what type of metadata is going to add value in the media supply chain; and secondly, in controlling the automated processes and confidence thresholds so that the metadata generated is of sufficient quality. This will be a significant change from current jobs and tasks, and may require new skills, but it ultimately depends on the same experience, creativity, and judgement.
In this article, we’ve touched more than once on using metadata to realize the value of assets. In the next article, we’ll look at some specific aspects of that, and the administrative metadata that supports it, specifically around the Media Supply Chain and Rights Management and Monetization.
Part 1 ... Standardization
Part 2 ... Metadata
Part 3 ... Rights & Monetization
Part 4 ... Remote Working
Part 5 ... Security
Part 6 ... Corporate Responsibility