Our ability to collect data is rapidly outstripping our ability to effectively store and use it. Organizations therefore face tough decisions about what data to archive (or dispose of) to effectively meet their business goals. PHOcus addresses this problem in the context of image data (photos) by proposing which photos to archive to meet an online storage budget. The decision is based on factors such as usage patterns and their relative importance, the quality and size of a photo, the relevance of a photo to a usage pattern, the similarity between different photos, and policy requirements on which photos must be retained. We formalize the photo archival problem and give an efficient algorithm with an optimal approximation guarantee. We then demonstrate our system, PHOcus, on an e-commerce application as well as with personal photos on a smartphone, and discuss how many of the inputs to the problem can be obtained automatically.
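The abstract does not spell out the formalization or the algorithm, so the following is only a minimal sketch of the kind of budgeted selection described above, not the PHOcus method itself. It assumes a hypothetical objective in which each usage pattern is served by its best retained photo (relevance times quality, weighted by the pattern's importance), so that near-duplicate photos add little once one of them is kept. It also assumes a simple greedy rule that repeatedly keeps the photo with the largest gain per unit of storage. The names `Photo`, `coverage`, and `choose_photos`, and all sample values, are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Photo:
    name: str
    size: int       # storage cost (must be positive), e.g. in MB
    quality: float  # assumed to lie in [0, 1]


def coverage(kept, patterns, relevance):
    """Assumed objective: each pattern is served by its best kept photo."""
    total = 0.0
    for pattern, weight in patterns.items():
        best = max(
            (relevance.get((pattern, p.name), 0.0) * p.quality for p in kept),
            default=0.0,
        )
        total += weight * best
    return total


def choose_photos(photos, patterns, relevance, budget, must_keep=frozenset()):
    """Greedy sketch: keep policy-mandated photos, then add by gain per unit size."""
    kept = [p for p in photos if p.name in must_keep]
    used = sum(p.size for p in kept)
    if used > budget:
        raise ValueError("policy-mandated photos alone exceed the budget")
    candidates = [p for p in photos if p.name not in must_keep]
    while True:
        base = coverage(kept, patterns, relevance)
        best_photo, best_ratio = None, 0.0
        for p in candidates:
            if used + p.size > budget:
                continue  # does not fit in the remaining budget
            gain = coverage(kept + [p], patterns, relevance) - base
            ratio = gain / p.size
            if ratio > best_ratio:
                best_photo, best_ratio = p, ratio
        if best_photo is None:
            break  # nothing fits, or nothing adds value
        kept.append(best_photo)
        used += best_photo.size
        candidates.remove(best_photo)
    return kept


# Illustrative data: "a" and "b" are near-duplicates serving the same pattern.
photos = [Photo("a", 4, 0.9), Photo("b", 4, 0.8), Photo("c", 3, 0.7), Photo("d", 5, 1.0)]
patterns = {"product_page": 3.0, "search_thumbnail": 1.0}
relevance = {
    ("product_page", "a"): 1.0,
    ("product_page", "b"): 1.0,
    ("product_page", "d"): 0.2,
    ("search_thumbnail", "c"): 1.0,
}
kept = choose_photos(photos, patterns, relevance, budget=12, must_keep={"d"})
archived = [p.name for p in photos if p not in kept]
```

In this toy run the policy-mandated photo `d` is kept first, then `a` and `c` are added, and the near-duplicate `b` is proposed for archival because it no longer fits and would add nothing once `a` is retained.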
|Number of pages||4|
|Journal||Proceedings of the VLDB Endowment|
|State||Published - 2022|
|Event||48th International Conference on Very Large Data Bases, VLDB 2022 - Sydney, Australia. Duration: 5 Sep 2022 → 9 Sep 2022|