IaaS clouds are becoming a standard way of providing elastic compute capacity at an affordable cost. To achieve that, VM provisioning system has to optimally allocate I/O and compute resources. One of the significant optimization opportunities is to leverage content similarity across VM images. While many studies have been devoted to de-duplication of VM images, this paper is to the best of our knowledge, the first one to comprehensively study the relationship between the VM image similarity structure and the runtime I/O access patterns.
Our study focuses on block-level similarity and I/O access patterns, revealing correlations between common content of different images and application-level access semantics. Furthermore, it also zooms on several runtime I/O access pattern aspects, such as the similarity of the sequences in which common content is accessed. The results show a strong tendency for access pattern locality within common content clusters across VM images, regardless of the rest of the composition. Furthermore, it reveals a strong tendency for read accesses to refer to the same subsets of blocks within the common content clusters, while preserving the same ordering. These results provide important insights that can be used to optimize on-demand VM image content delivery under concurrency.