Optimizing Storage Costs of Large News Archives Through Dynamic Management of Video Bit Rates

The largely unpredictable long-term editorial value of most news content makes managing news archives frustrating from a business and cost perspective. A relatively small percentage of stories and video assets will be re-purposed and monetized in the future, with the remaining content consuming costly storage resources but ultimately never used. This paper explores a strategy that preserves all assets but also controls storage costs, by dynamically reducing the file size of archived clips when configurable criteria indicate that they are less likely to be used in the future.

Introduction

News archives are inherently frustrating from a business perspective because while some content will ultimately prove to be very valuable and important, much of the archived material will likely never be used or monetized. The ROI in a news archive resides in a relatively small number of stories and video assets, with that value only realized at a future date. The rest of the archived content – that which is never ultimately used – is simply a drain on resources. The obvious problem is that it is difficult, if not impossible, to accurately predict which stories will have future editorial value, and which ones will not.

However, the editorial behavior of most news archives does generally follow a consistent pattern: the longer a story or asset remains in storage without being touched, the less likely it is that it will be used in the future. There are exceptions to this model, of course, but many of these are known – for example, major world events that are likely to be commemorated annually, decennially or even later – and the content can be flagged at the time it is saved based on the user’s editorial judgement. Understanding this general rule presents the opportunity for many sites to reduce their overall storage costs by dynamically managing the characteristics of the video files they are storing, without deleting content.

The cost of archive storage is a function of the number of stories in the archive, the length of time the story is saved (i.e. its age), and the size of the video files associated with each story. This is generally true regardless of whether your storage medium is solid state memory, spinning disk, digital tape or the cloud. As such, reducing the size of unused – or highly unlikely to be used – video files directly results in lower storage costs.
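The relationship described above can be sketched as a simple model in which cost grows with file size and retention time. This is only an illustration: the names `StoredAsset`, `clip_size_gb` and `archive_cost` are hypothetical, the price is left as a parameter because real pricing varies by medium and vendor, and the size calculation counts video bits only (no audio or container overhead).

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class StoredAsset:
    size_gb: float        # size of the video file in decimal gigabytes
    months_stored: float  # how long the file has been retained


def clip_size_gb(duration_minutes: float, bit_rate_mbps: float) -> float:
    """Approximate file size of a clip at a constant video bit rate."""
    bytes_per_second = bit_rate_mbps * 1_000_000 / 8
    return bytes_per_second * duration_minutes * 60 / 1e9


def archive_cost(assets: Iterable[StoredAsset], price_per_gb_month: float) -> float:
    """Sum of size x retention time x unit price over all assets."""
    return sum(a.size_gb * a.months_stored * price_per_gb_month for a in assets)


# A two-minute clip kept for five years, at a production and a distribution rate.
# A unit price of 1.0 makes the result a relative cost in GB-months.
production = StoredAsset(clip_size_gb(2, 50), months_stored=60)
distribution = StoredAsset(clip_size_gb(2, 10), months_stored=60)
print(archive_cost([production], 1.0), archive_cost([distribution], 1.0))
```

Because the model is linear in file size, the cost ratio between the two versions equals the ratio of their bit rates, whatever unit price is supplied.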

Historically, once video had been written to the archive, it was assumed that the file would never change. For some organizations this may be the correct business decision, but for others, the ability to reduce video file sizes later in the lifecycle of some content may represent an opportunity to reduce costs and/or enable more content to be saved in the storage they already own.

Making video file sizes smaller typically involves lowering their bit rate, which can have an impact on quality – although not necessarily to a degree that will be noticeable to viewers in the resulting productions. If the thought of slightly reducing the quality of a file that hasn’t been previewed or restored in a year or more makes you run from the room, this option is not for you. However, if it makes business sense for you to explore reducing the size of long-unused video files to distribution bit rates, you might be able to achieve a 50% or greater reduction in long-term archive storage costs.

Balancing Bit Rates

News video is most often archived at production bit rates, which produce large files. Archiving at these high bit rates allows the video to be returned to production workflows at a later date with no loss of quality. However, we should keep in mind that all production files are eventually knocked down to distribution bit rates – which are much lower – for delivery to consumers. If we can accept saving unused files at bit rates that would normally be used for distribution, we can significantly save cost and storage.

Consider that:

  1. In general, the longer a story is kept in the archive without being used, the less likely it will be used in the future.
  2. Most news video is produced at bit rates of 25-50Mbps or higher.
  3. Distribution bit rates are generally in the range of 4-12Mbps.

What if, based on the time that has passed since a story was written to the archive or last retrieved, an automated process could transform the corresponding video file from production to distribution bit rates? In effect, this is a conscious trade-off in predicted story value versus storage cost. Given the large number of stories in a typical news archive, we know that most of this content will never be used again. However, we also know that a very small number of stories and assets will, against all odds, become relevant again after a long time in deep storage. The fact that we never actually delete the story hedges our bet while still allowing significant savings in storage.

How big an impact might changing the video bit rate have? As an example, let’s look at 120 TB of storage, equivalent to a 50-slot digital tape library with LTO6 drives as commonly used in many television stations. When archiving at 50Mbps, a station that produces and archives about 20 hours of content per week can store roughly five years of content.

If we instead archived all stories at 10 Mbps, that 120TB of storage could contain 25 years of content in the same device. And if even that big boost isn’t enough to convince you of the storage savings potential, consider the capacity gains if the same tape library is upgraded with LTO7 drives. The increased storage density would give the station more than 280TB of capacity in the same physical footprint, into which a whopping 60 years of content could be archived.
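The capacity figures above can be checked with a short calculation. The sketch below assumes decimal units (1 TB = 10^12 bytes), 52 weeks per year, and video bits only, ignoring audio, container overhead and tape formatting losses. It takes the upgraded library as roughly 280 TB. The function names are illustrative.

```python
WEEKS_PER_YEAR = 52


def terabytes_per_year(bit_rate_mbps: float, hours_per_week: float = 20.0) -> float:
    """Decimal terabytes archived per year at a constant video bit rate."""
    bytes_per_second = bit_rate_mbps * 1_000_000 / 8
    seconds_per_year = hours_per_week * 3600 * WEEKS_PER_YEAR
    return bytes_per_second * seconds_per_year / 1e12


def years_of_capacity(capacity_tb: float, bit_rate_mbps: float,
                      hours_per_week: float = 20.0) -> float:
    """How many years of content fit into the given capacity."""
    return capacity_tb / terabytes_per_year(bit_rate_mbps, hours_per_week)


for capacity_tb, rate_mbps in [(120, 50), (120, 10), (280, 10)]:
    years = years_of_capacity(capacity_tb, rate_mbps)
    print(f"{capacity_tb} TB at {rate_mbps} Mbps: {years:.1f} years")
```

Under these assumptions the three cases come out at about 5.1, 25.6 and 59.8 years, in line with the rounded figures of 5, 25 and 60 years.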

Storage capacity comparison
Storage capacity in years of a 50-slot tape archive at a station archiving 20 hours of content per week.

But of course, we don’t want to archive everything at 10Mbps; we want to keep the “good stuff” – the content most likely to have future value – at production quality. How can we do that, without requiring significant, ongoing, manual media management effort? Efficient, automated workflows, coupled with a mechanism for enabling journalists to flag the “good stuff”, enable the bit rates of archived video to be optimized dynamically without requiring any user actions after the initial archiving.

First, during the production process, producers and editors use their newsroom computer system (NRCS) to flag stories they expect to have long-term value greater than one year. Journalists generally have a good feel for this. Earthquakes, tornados, plane crashes and presidential visits get flagged; skiing squirrels, backyard weather and annual festivals do not.

Regardless of the setting of this “important” flag, all story video is initially archived at the production bit rate. When content reaches an age of 18 months, an automated process examines it to see if it has been previewed or used since it was written to the archive. While this time frame can be customized to suit the station or group’s business needs, the 18-month window proposed here takes into account the common re-use of video on an event’s first anniversary. If it has not been accessed during this period, and the “important” flag is not set, the file is automatically transcoded to reduce it to a distribution bit rate.
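The decision rule described above can be sketched as a small predicate. This is an illustration under stated assumptions, not the implementation of any particular archive product: the `ArchivedStory` record, its field names and the `should_transcode` function are hypothetical, and the 18-month threshold is exposed as a parameter because the text notes it can be customized.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

REVIEW_AGE_MONTHS = 18  # configurable; 18 months covers first-anniversary re-use


@dataclass
class ArchivedStory:
    archived_on: date
    last_accessed_on: Optional[date]  # None if never previewed or used since archiving
    important: bool                   # the "important" flag set in the NRCS
    at_production_rate: bool = True   # False once the file has been reduced


def months_between(start: date, end: date) -> int:
    """Whole calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return months


def should_transcode(story: ArchivedStory, today: date,
                     review_age_months: int = REVIEW_AGE_MONTHS) -> bool:
    """True if the story should be reduced to a distribution bit rate."""
    if not story.at_production_rate or story.important:
        return False
    if months_between(story.archived_on, today) < review_age_months:
        return False
    # Any preview or use since archiving keeps the production-rate file.
    return story.last_accessed_on is None


story = ArchivedStory(archived_on=date(2015, 1, 15), last_accessed_on=None, important=False)
print(should_transcode(story, today=date(2016, 9, 1)))
```

A scheduled job could apply this predicate to each archived story and hand the qualifying files to a transcoder. A site that prefers to measure the window from the last retrieval rather than from the archive date would change only the final check.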

Because up to 90% of content in an archive with at least five years of stored assets will eventually meet these criteria, the number of stories that can now fit into the station’s existing storage device is dramatically increased, without actually expanding its capacity.

Content that is flagged as important, less than 18 months old, or has been previewed or used since archiving is kept at production bit rates; all other content is reduced to distribution bit rates.
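To get a rough sense of the effect of this split, the average bit rate of the archive can be estimated from the fraction of content that has been reduced. The sketch below assumes a mature archive in which that fraction is stable, equal clip durations on both sides of the split, and the example rates of 50 Mbps for production and 10 Mbps for distribution. The function names are illustrative.

```python
def blended_bit_rate_mbps(reduced_fraction: float,
                          production_mbps: float = 50.0,
                          distribution_mbps: float = 10.0) -> float:
    """Average bit rate when a fraction of content sits at the distribution rate."""
    if not 0.0 <= reduced_fraction <= 1.0:
        raise ValueError("reduced_fraction must be between 0 and 1")
    return (reduced_fraction * distribution_mbps
            + (1.0 - reduced_fraction) * production_mbps)


def storage_saving(reduced_fraction: float,
                   production_mbps: float = 50.0,
                   distribution_mbps: float = 10.0) -> float:
    """Fraction of storage saved relative to keeping everything at production rate."""
    blended = blended_bit_rate_mbps(reduced_fraction, production_mbps, distribution_mbps)
    return 1.0 - blended / production_mbps


fraction = 0.9  # the upper bound cited for a mature archive
print(f"average bit rate: {blended_bit_rate_mbps(fraction):.1f} Mbps")
print(f"storage saving:   {storage_saving(fraction):.0%}")
```

With 90% of content reduced, the average works out to about 14 Mbps, a saving of roughly 72% compared with keeping every file at 50 Mbps. Lower reduced fractions or higher distribution rates shrink the saving accordingly.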

The Automated Optimization Advantage

The biggest challenge and frustration in managing news archives is the fact that a very large number of archived assets will never be touched again after they are saved. There really is no way around this; it is the very nature of news that we can’t predict with any certainty which stories will or won’t develop or be related to others over time. In general, it’s better to keep everything than throw something away, only to find you need it later. Unfortunately, this means retaining a lot of content that likely never will be used. Content that is never used again effectively has no value, but it continues to consume storage resources, essentially forever.

Some prognosticators say that continuously dropping storage prices will make this a non-issue, but we disagree. The rate at which the price of storage has been dropping has slowed, while the amount of storage we use continues to grow, driven by the need to not only produce more content but also to do so at higher resolutions. It would be unrealistic to assume that storage costs will continue to drop until they reach nothing; storage will never be free.

Meanwhile, the amount of content archived by most news organizations has increased significantly over time, offsetting to a considerable degree any savings from declining storage prices. And finally, assets in news archives are generally never purged or deleted (nor should they be). Thus, effectively, you have a very large and growing number of unused stories, multiplied by a relatively small storage cost each, multiplied again by an essentially infinite amount of retention time – which adds up to a significant cost.

While these costs cannot be entirely eliminated, dynamic management over time of the bit rates of the archived video can lead to significant storage savings with only a minor trade-off in the final distribution quality of unexpectedly re-used material. By implementing this model in a software solution that combines fully automated workflows and transcoding with seamless integration into users’ existing NRCS, these benefits can be realized with no additional operational overhead – the best of all worlds.