Cloud offers several benefits in terms of speed, efficiency and cost compared to traditional LTO storage, and can help solve problems in production that would usually call for an on-site investment in hardware, says Jim McKenna of Facilis.
Jim McKenna is VP of Sales & Marketing at Facilis.
This is the time of year when budgets are being planned. For the average content creator, there are several items that could make 2020 a good year – starting with that amazing new Mac Pro, the new RED 8K camera and some additional terabytes in the storage pool for all the 8K media you’ll be shooting. The IT team also thinks it would be smart to keep some critical data off-site, so perhaps it’s time to think about cloud storage.
The Cloud Conundrum
On-premise back-up and archival systems have been commonplace in media production since before centralised storage was a thing. Fifteen years ago, AIT and DLT technology recorded data to tape at about the same speed that files upload to the average cloud storage service today. In the age of low-capacity ATA/IDE hard drives, this wasn’t a big problem. In 2019, we saw HDDs closing in on 20TB each and tape cartridges (now LTO-8) at 12TB uncompressed capacity. Since LTO generations have historically doubled in capacity, we’re due to see a 24TB cartridge soon.
Cloud, on the other hand, is limitless. That seems like a good resource to have, but consider the cost of ‘limitless’. Even though you only pay for what you use, there is little to no economy of scale. When buying an LTO system, the drive and robotic library come at a large up-front cost, but as you use it the price per TB goes down, because you’re only buying cartridges.
It’s common to consider the nice round number of 100TB as the point of convergence between cloud and LTO, at least when deploying a small-scale library system. That number may decrease with the next generation of LTO, since you’ll be able to reach 100TB with only four tapes and large libraries may no longer be required.
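To make that convergence point concrete, here is a rough back-of-the-envelope comparison in Python. Every price in it is an assumed placeholder rather than a quote from any vendor – substitute your own figures for the library, cartridges and cloud rate.

```python
# Rough break-even sketch between a small LTO library and warm cloud storage.
# All prices are assumptions -- substitute real quotes before drawing conclusions.
import math

LTO_LIBRARY_COST = 8000.0      # one-time: drive plus small robotic library (assumed)
LTO_CARTRIDGE_COST = 90.0      # per LTO-8 cartridge (assumed)
LTO_CARTRIDGE_TB = 12.0        # native capacity per cartridge
CLOUD_RATE_TB_MONTH = 5.0      # warm-tier storage in $/TB/month (assumed)
RETENTION_MONTHS = 36          # how long the archive must be kept

def lto_cost(tb: float) -> float:
    """Up-front hardware plus however many cartridges the data set needs."""
    cartridges = math.ceil(tb / LTO_CARTRIDGE_TB)
    return LTO_LIBRARY_COST + cartridges * LTO_CARTRIDGE_COST

def cloud_cost(tb: float) -> float:
    """Pay-as-you-go storage over the retention period (egress not included)."""
    return tb * CLOUD_RATE_TB_MONTH * RETENTION_MONTHS

for tb in (25, 50, 100, 200):
    print(f"{tb:>4} TB   LTO ${lto_cost(tb):>9,.0f}   cloud ${cloud_cost(tb):>9,.0f}")
```

With these particular assumptions the lines cross somewhere around the 100TB mark, but the point of the sketch is that the crossover moves with your retention period and cartridge pricing, not that any one number is correct.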
Still, the idea of an externally managed, disaster-proof (multi-site replicated) store of data is very attractive, especially if your company normally flexes with the changing workload. A big back-up system to manage your big back-up jobs will sit idle in slow times. Cloud back-up costs you nothing if you delete the uploaded material at the completion of the job. It’s an OpEx, not a CapEx – if you decide it’s not worth the ongoing cost, shut it off. We’ve all seen large investments justified in this way.
However, it’s still speed that gives the conversation pause. In the early days of tape-based back-up, entire networks rarely held more than a few terabytes. With today’s larger media files, and an average cloud upload speed of about 1TB per day, how long will it take to run a full back-up of your storage system? How much data does your facility ingest and generate in a single day during busy times? And when you need that data back, how long are you willing to wait?
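As a quick sanity check on those numbers, the sketch below works out how long a full back-up takes at the roughly 1TB-per-day figure mentioned above, and at a couple of faster sustained link speeds. The data set size and link speeds are assumptions for illustration.

```python
# Back-of-the-envelope upload times for a full back-up at assumed link speeds.
DATASET_TB = 100                      # size of the storage pool to back up (assumed)

link_tb_per_day = {
    "~100 Mb/s effective (about 1 TB/day)": 1.0,
    "1 Gb/s sustained": 10.8,         # 1 Gb/s is roughly 0.45 TB per hour
    "10 Gb/s sustained": 108.0,
}

for link, tb_per_day in link_tb_per_day.items():
    days = DATASET_TB / tb_per_day
    print(f"{link:<38} {DATASET_TB} TB back-up takes about {days:5.1f} days")
```

At 1TB per day, a 100TB pool takes more than three months to push up in full – which is why the question of how much you generate per day matters as much as total capacity.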
The Storage Hierarchy Workflow
Tiered storage management is big business in the enterprise world. Using complex algorithms to cache the most-used objects and shuffle seldom-used ones off to slower media is a tenet of good enterprise storage dataflow. In content creation, however, this approach can leave something to be desired. Front-end caching in media production needs to represent much more of the overall system’s capacity because the file sizes are so much larger. When elements needed immediately fall outside that speedy cached capacity, it can be disastrous for a screening or supervised edit session. As a result, tier 1 storage is large, and files are seldom moved unless they’re sure to go unused for an extended time.
This workflow lends itself to the cloud in an interesting way. Assuming a ‘warm’ storage class on the cloud account, there is no faster tier 3 medium for random file access. Cloud can be used for deep archive while still maintaining instant access to the video and audio files that may be needed in production. This allows a workflow in which the entirety of a production shoot is uploaded via an on-premise cloud cache, maintaining tier 1 access to all assets on the cache until after the initial rough cut.
Media can then be managed (moved) into an alternate location on tier 1, where it resides for the duration of the extended editorial process, while the remaining production assets are flushed from disk but remain in the cloud location. The flushed files keep their original locations in the directory structure on the cache but take up no disk space.
The savings in tier 1 capacity can be realised immediately, and redundancy is achieved for all the production assets on cloud storage. The best part is, the shots that weren’t used in the initial rough cut are all still available and restoration of the high-bitrate master is only moments away. Add an asset management interface that saves preview clips of the uploaded assets for offline viewing, and the workflow comes together quite nicely – random access to needed assets in a familiar interface, through a common directory structure, without taking up space on expensive tier 1 storage.
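For readers who want to picture the mechanics, here is a minimal, hypothetical sketch of a flush-and-restore cycle against an S3-compatible bucket, written in Python with boto3. A real cloud cache or asset management layer handles this transparently; the bucket name, key prefix and file paths below are invented purely for illustration.

```python
# Minimal sketch of a flush/restore cycle against an S3-compatible bucket.
# The bucket, prefix and paths are hypothetical; a production cloud cache
# does this transparently behind the file system.
import os
import boto3

s3 = boto3.client("s3")          # credentials and endpoint from the environment
BUCKET = "production-archive"    # hypothetical bucket
PREFIX = "show-042/raw-media/"   # hypothetical key prefix

def flush(local_path: str) -> None:
    """Upload the asset, then leave a zero-byte placeholder at the same path."""
    key = PREFIX + os.path.basename(local_path)
    s3.upload_file(local_path, BUCKET, key)
    os.truncate(local_path, 0)   # path survives in the directory, disk space is reclaimed

def restore(local_path: str) -> None:
    """Pull the high-bitrate master back when editorial needs it again."""
    key = PREFIX + os.path.basename(local_path)
    s3.download_file(BUCKET, key, local_path)

flush("/mnt/tier1/show-042/raw-media/A001_C007.mxf")
# ...later, when the shot makes it back into the cut:
restore("/mnt/tier1/show-042/raw-media/A001_C007.mxf")
```

The design point is simply that the directory structure never changes; only the payload moves between tiers, which is what keeps the workflow familiar to editors.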
Who’s Who in Cloud
Since we’re only discussing the usage model of cloud as storage, not cloud computing for image processing, analytics or virtual machines, this should be an easy choice, right? But it’s never easy with the cloud.
Search for something like ‘best cloud storage’ and you’ll have to weed through dozens of consumer-focussed services that simply rebrand someone else’s storage and compete on desktop integration, file search and analytics, multi-device continuous synchronisation and collaborative features, to name a few. These consumer- and SME-focussed services want to be your end-to-end solution, so they often lack the common interfaces (APIs) needed to live inside a larger workflow. For cloud storage to be usable in an enterprise or rich media content creation environment, it must be compatible with the internal network and have some ability to be integrated with the company’s applications.
Narrowing the list down to the few services aimed at professional environments, and that integrate well within them, yields some familiar brands and some new faces. Amazon and Microsoft have focussed most heavily on corporate environments, making them the top choices for many facilities. The AWS S3 interface has been adopted by some other cloud storage providers (such as Wasabi) in order to leverage the compatibility many media management applications already have with AWS. Other providers, like Backblaze B2, chose to create their own interface, considering this a more streamlined and less costly way to interface with cloud storage.
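The practical benefit of S3 compatibility is that the same client code that talks to AWS can be pointed at another provider simply by changing the endpoint URL. The short Python/boto3 sketch below illustrates the idea; the endpoint, bucket name and credentials are placeholders to be checked against your provider’s documentation.

```python
# Pointing a standard S3 client at an S3-compatible provider.
# Endpoint, bucket and keys below are illustrative placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.wasabisys.com",   # the provider's S3-compatible endpoint
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
)

# Identical calls work against AWS S3 or any S3-compatible service.
for obj in s3.list_objects_v2(Bucket="production-archive").get("Contents", []):
    print(obj["Key"], obj["Size"])
```

This is why media management tools that already speak S3 can often add a new storage back end without any new integration work.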
Regardless of how you get to the storage, these services all have one thing in common – you pay by the GB/TB. In some cases you only pay for what you’re currently using, while in others you pay for any data uploaded, with a standard retention period thereafter.
In most cases, you’ll also pay for download or egress (because you may not always be downloading – you may be moving or copying within the cloud). Egress cost can put a damper on the random access of media within the workflow outlined above. More importantly, it can have a big impact on the monthly bill, and the finance department doesn’t like inconsistent costs.
The Wasabi pricing model eliminates the egress cost but replaces it with a mandatory 90-day retention period. If you upload a file, you’re charged for that file’s storage for at least 90 days, no matter when you delete it. This period is well within the typical retention range for data placed in cloud storage, so the savings in egress cost make this solution worth a look.
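A toy comparison makes the trade-off clearer. The rates below are assumptions for illustration only: one model charges for storage plus egress, the other a flat storage rate with a 90-day minimum retention and no egress fee.

```python
# Toy comparison of the two pricing models; all rates are assumed for illustration.
STORED_TB = 20          # data kept in the cloud
DOWNLOADED_TB = 5       # pulled back down during the project
MONTHS_KEPT = 2         # deleted after two months

A_STORAGE = 23.0        # model A: $/TB/month (assumed)
A_EGRESS = 90.0         # model A: $/TB downloaded (assumed)
B_STORAGE = 6.0         # model B: $/TB/month (assumed)
B_MIN_MONTHS = 3        # model B: 90-day minimum retention

cost_a = STORED_TB * A_STORAGE * MONTHS_KEPT + DOWNLOADED_TB * A_EGRESS
cost_b = STORED_TB * B_STORAGE * max(MONTHS_KEPT, B_MIN_MONTHS)

print(f"Model A (storage + egress):      ${cost_a:,.0f}")
print(f"Model B (flat, 90-day minimum):  ${cost_b:,.0f}")
```

The numbers themselves matter less than the behaviour: the egress-free model produces a predictable monthly bill, while the egress-charged model penalises exactly the kind of random restore activity the tiered workflow above depends on.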
The Bigger Picture
Cloud can’t be everything to everyone, but it can help solve certain problems in production that would ordinarily call for on-premise hardware investment. For the big data producers in our industry, tape-based localised archive and back-up solutions may still make sense, unless external file access and business continuity protection are higher priorities than cost. For those who want to integrate cloud for structured tier 1 offload with random access, look for companies that integrate the right cloud service into your content creation environment and can be held accountable for making it work for your specific needs.