operation, data placement efficiency on disk drives is not a priority. Files can be modified by adding blocks of data, but this operation, again, does not guarantee efficient data placement. The end result is file system fragmentation, which reduces efficiency as a file system ages. Typical disk drives are very efficient when reading or writing contiguous data, but fragmentation forces additional disk seek operations. An average seek takes about 15 milliseconds (ms), which is quite long: at 30 frames per second, a frame lasts about 33 ms, so a single seek consumes nearly half the duration of an entire frame of media.
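The seek-versus-frame arithmetic can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the 15 ms seek time and the 30 frames per second rate come from the text, while the function names and the sample seek counts are assumptions chosen for illustration.

```python
# Back-of-the-envelope check of seek cost versus frame time.
# 15 ms and 30 fps come from the article; everything else is illustrative.

AVG_SEEK_MS = 15.0          # average disk seek time quoted in the text
FRAMES_PER_SECOND = 30.0    # frame rate quoted in the text


def frame_duration_ms(fps: float = FRAMES_PER_SECOND) -> float:
    """Duration of one frame in milliseconds."""
    return 1000.0 / fps


def seek_cost_in_frames(extra_seeks: int,
                        seek_ms: float = AVG_SEEK_MS,
                        fps: float = FRAMES_PER_SECOND) -> float:
    """Time spent on extra seeks, expressed in frame durations."""
    return extra_seeks * seek_ms / frame_duration_ms(fps)


if __name__ == "__main__":
    print(f"one frame lasts {frame_duration_ms():.1f} ms")
    # Hypothetical numbers of extra seeks caused by fragmentation.
    for seeks in (1, 2, 10):
        cost = seek_cost_in_frames(seeks)
        print(f"{seeks} extra seek(s) cost {cost:.2f} frame(s) of time")
```

Under these figures, two extra seeks while reading a fragmented file already use up most of one frame's time budget.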
We started to study file system operations and even counted instructions and machine cycles required to create files of specific lengths. We then started to think about the efficiencies that could be realized if we could consider the data immutable rather than amendable. Considering that a great deal of released and distributed media data is, in fact, immutable, it seemed reasonable to create a file system that was specifically built to efficiently store this type of data.
After several years of development and testing, we released our first object-based file system for immutable data, called the Web Object Scaler (WOS). This data storage software was designed specifically for data placement and retrieval efficiency for immutable media data. The concept of V-nodes in directory structures and I-nodes in file allocation tables was eliminated completely and replaced with a very efficient, single-layer data placement process that does not have to rely on locking for data consistency. Data is purposefully grouped together in “buckets” of like size. For example, all 1 MB objects are kept together so that if one is deleted, another can take that same location without creating fragmentation or any degradation of performance. Each object consists of an assemblage of data, metadata, a checksum for guaranteeing data consistency, and a distribution policy that can be modified over time based upon distribution requirements. Very large objects are deliberately stored across multiple disk drives, which can be accessed simultaneously to guarantee consistent delivery. Finally, we developed several methods of failure resilience that go well beyond simple replication and allow orderly data migration, which is critical in the maintenance of large-scale storage systems. The end result is a data storage technology that is so efficient that it can be filled to 99 percent of its capacity and still function perfectly. The addressing scheme allows the storage of 1 trillion objects in effectively one namespace.
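The idea of size buckets and checksummed objects can be illustrated with a small in-memory sketch. This is not the WOS implementation, whose internals are not described here; the class names `StoredObject` and `BucketStore`, the slot sizes, the use of SHA-256, and the `(size, index)` object identifier are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class StoredObject:
    """Hypothetical object layout: data, metadata, checksum, policy."""
    data: bytes
    metadata: dict
    checksum: str
    policy: str


class BucketStore:
    """Toy store that groups objects into buckets of like size."""

    def __init__(self, slot_sizes):
        self.slot_sizes = sorted(slot_sizes)
        self.slots = {size: [] for size in self.slot_sizes}  # slot contents
        self.free = {size: [] for size in self.slot_sizes}   # freed slot indexes

    def _bucket_for(self, length):
        # Pick the smallest slot size that can hold the object.
        for size in self.slot_sizes:
            if length <= size:
                return size
        raise ValueError("object larger than the largest bucket")

    def put(self, data, metadata=None, policy="default"):
        size = self._bucket_for(len(data))
        checksum = hashlib.sha256(data).hexdigest()
        obj = StoredObject(data, dict(metadata or {}), checksum, policy)
        if self.free[size]:
            # Reuse a freed slot of the same size: no fragmentation.
            index = self.free[size].pop()
            self.slots[size][index] = obj
        else:
            index = len(self.slots[size])
            self.slots[size].append(obj)
        return (size, index)

    def get(self, object_id):
        size, index = object_id
        obj = self.slots[size][index]
        if obj is None:
            raise KeyError(object_id)
        # Verify the checksum before handing the data back.
        if hashlib.sha256(obj.data).hexdigest() != obj.checksum:
            raise IOError("checksum mismatch")
        return obj.data

    def delete(self, object_id):
        size, index = object_id
        if self.slots[size][index] is None:
            raise KeyError(object_id)
        self.slots[size][index] = None
        self.free[size].append(index)
```

In this sketch, deleting an object and then storing another of the same size class places the new object in the vacated slot, so the bucket never grows holes of odd sizes. A real system would also have to handle on-disk layout, distribution policy, and striping of very large objects across drives, none of which is modeled here.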
BROADCAST BEAT MAGAZINE
IBC Issue September 2015