Ph.D. Alumnus: Sari Sultan
Reference:
Sari Sultan
Configuring In-Memory Caches: From TTL-Aware Sizing to Interval-Based Historical Analysis with HistoChron
Ph.D. Thesis, Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada, 2024.
Supervisor(s):
Michael Stumm
Abstract:
In-memory caches such as Memcached and Redis are crucial for enhancing the performance of distributed systems by significantly reducing query response times. Correctly sizing these caches is critical, especially considering that prominent organizations use terabytes to petabytes of Dynamic Random Access Memory (DRAM) for these caches. Configuring these caches to operate efficiently remains a challenging task, considering the dynamic nature of modern workloads where caching requirements can change significantly over time.
Our thesis is that the state-of-the-art for in-memory cache performance analysis does not accommodate modern workloads. This gap is evident in the lack of consideration for Time-to-Live (TTL) attributes and heterogeneous object sizes, as well as the absence of interval-based historical analysis to address the dynamic nature of these workloads. This dissertation introduces a comprehensive reevaluation of in-memory cache performance analysis tools. We propose novel tools that account for TTL attributes and heterogeneous object sizes, and we introduce a new tool that enables efficient interval-based historical analysis of in-memory cache workloads. In particular, one of our primary contributions is the development of Miss Ratio Curve (MRC) generation and Working Set Size (WSS) estimation algorithms that accommodate TTL attributes and heterogeneous object sizes. Our analysis of real-world cache workloads demonstrates that including TTLs can lead to an average reduction in cache memory footprint by 69%, and up to 99%.
Additionally, we introduce HistoChron, a novel methodology with a Graphical User Interface (GUI) that enables efficient interval-based historical analysis of caching workloads. Evaluated on over 5,000 cache access traces from six real-world datasets, encompassing more than 300 billion accesses over an 18-year span, HistoChron demonstrates its efficacy by generating exact MRCs over any arbitrary time interval using just 24 MiB of storage space weekly. We also present a lower-overhead variant of HistoChron that generates approximate results with a mean error of less than 1%. These contributions advance the field of in-memory cache management, offering a robust framework for optimizing in-memory caches in alignment with the dynamic demands of modern workloads.
Keywords:
Memory Management, In-memory Caches, TTL, MRC-generation, Working Set Size
BibTeX:
@phdthesis(Sultan-PhD24,
  author = {Sari Sultan},
  title = {Configuring In-Memory Caches: From TTL-Aware Sizing to Interval-Based Historical Analysis with HistoChron},
  school = {Department of Electrical and Computer Engineering, University of Toronto},
  address = {Toronto, Canada},
  supervisors = {Michael Stumm},
  month = {August},
  year = {2024},
  keywords = {Memory Management, In-memory Caches, TTL, MRC-generation, Working Set Size}
)