This repository contains the artifact for the paper "Mitigating Resource Usage Dependency in Sorting-based KV Stores on Hybrid Storage Devices via Operation Decoupling", accepted at USENIX ATC '25.
This artifact allows reviewers to reproduce the main experimental results of the paper, including:
- DecouKV improves CPU utilization compared to the other systems and reduces utilization fluctuations under write-intensive workloads (Load, YCSB-A).
- DecouKV achieves higher throughput than the other systems under write-intensive workloads (Load, YCSB-A, -F); it also provides moderate improvements under read- or scan-intensive workloads (YCSB-B, -C, -D, -E).
- DecouKV reduces average latency compared to the other systems under write-intensive workloads (Load, YCSB-A, -F); it also brings slight improvements under read- or scan-intensive workloads (YCSB-B, -C, -D, -E).
Experiments are conducted on four key-value store systems: RocksDB, MatrixKV, ADOC, and DecouKV.
```
Decoukv/
├── db_impl/                  # All systems under comparison
│   ├── adoc                  # ADOC
│   ├── decoukv               # DecouKV (the code we modified and designed)
│   ├── leveldb0              # LevelDB
│   ├── lsm_nvm
│   ├── matrixkv              # MatrixKV
│   └── rocksdb               # RocksDB
├── workloads/                # YCSB workloads A-F
├── test_results/             # Results of experiments
│   ├── ycsb_latency.csv      # Latency results under YCSB workloads
│   ├── ycsb_throughput.csv   # Throughput results under YCSB workloads
│   └── cpu_utilization_XXX_workloadX.png   # CPU utilization results under YCSB workloads
├── README.md
├── test.sh                   # The script for testing all systems
├── draw_latency.py           # The script for plotting latency comparison across all systems
└── draw_throughput.py        # The script for plotting throughput comparison across all systems
```
OS: Ubuntu 20.04.6 LTS with Linux kernel 5.15. Hardware: 128 GB Intel Optane DCPMM as the fast device, and 960 GB Intel S4520 SSDs with SATA 3.0 interfaces as the slow devices.
- A dedicated server with at least 10 CPU cores.
- One SSD with more than 200 GB of free space.
- One PMEM device with more than 50 GB of free space.
- The total runtime is approximately 60 hours.
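Before running, you can sanity-check these prerequisites with standard tools. In the sketch below, `/mnt/ssd` and `/mnt/pmem0` are placeholder mount points; substitute the actual paths of your SSD and PMEM devices.

```bash
# Check the number of available CPU cores (at least 10 required)
nproc

# Check free space on the slow (SSD) and fast (PMEM) devices;
# /mnt/ssd and /mnt/pmem0 are example mount points
df -h /mnt/ssd /mnt/pmem0
```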
To clone the repository:

```
$ git clone https://github.com/your-org/decoukv.git
$ cd decoukv
```

To initialize submodules:

```
$ git submodule init
$ git submodule update
```
To install the required packages:

```
$ sudo apt install libsnappy-dev libgflags-dev zlib1g-dev libbz2-dev liblz4-dev libzstd-dev libpthread-stubs0-dev libnuma-dev libstdc++-dev libpmem-dev liburing-dev
```
To run the tests:

- Configure the fast (PMEM) and slow (SSD) device paths

  Edit the `test.sh` script before running. Specifically, modify lines 7 and 8 to set the correct paths:
  - Set `db_path` to the directory on your SSD (slow device)
  - Set `pmem_path` to the directory on your PMEM (fast device)
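  After editing, the two assignments might look like the following; the paths shown are hypothetical examples, not values shipped with the script.

  ```bash
  # Hypothetical example values for lines 7-8 of test.sh;
  # replace with directories on your own devices
  db_path="/mnt/ssd/decoukv_db"     # slow device (SSD)
  pmem_path="/mnt/pmem0/decoukv"    # fast device (PMEM)
  ```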
- Run all benchmarks and generate CPU utilization plots:

  ```
  $ nohup ./test.sh > test.log 2>&1 &
  ```

  This will:
  - Compile all four systems
  - Run the Load and YCSB-A to -F workloads
  - Output the CPU utilization fluctuation plots (Figure 11) into `./test_results/`
  This step may take over 24 hours to complete. You can monitor the script's progress in the `test.log` file. Reducing the data volume (default: 100 GB) can shorten the runtime, but may affect the final results.
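  For example, to follow progress while the benchmarks run:

  ```
  $ tail -f test.log
  ```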
- Generate throughput figures:

  ```
  $ ./draw_throughput.sh   # Generates Figure 14
  ```

- Generate latency figures:

  ```
  $ ./draw_latency.sh   # Generates Figure 13
  ```

- Clean up experimental data and build files:
  ```
  $ ./clean.sh
  ```

Other experiments in the paper can be reproduced by modifying the following parameters:

- In `test.sh`: you can adjust the number of threads, the systems to test, the number of CPU cores, and the YCSB workloads to run.
- In the `workloads/` directory: each YCSB workload file can be modified to change parameters such as:
  - Database size (`recordcount`)
  - Number of operations (`operationcount`)
  - Read/write ratio
  - Key distribution (uniform or Zipfian)
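  As a concrete example, the parameters above map onto standard YCSB property names; the values below are illustrative placeholders, assuming this harness follows the usual YCSB workload-file format:

  ```
  # Illustrative YCSB workload settings (values are placeholders)
  recordcount=100000000         # database size
  operationcount=100000000      # number of operations
  readproportion=0.5            # read/write ratio
  updateproportion=0.5
  requestdistribution=zipfian   # key distribution: uniform or zipfian
  ```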
- In `db_config.yaml`: system-level configurations can be customized, including:
  - SSTable size
  - MemTable size
  - Other internal parameters
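  The exact keys depend on the shipped `db_config.yaml`; the snippet below is a hypothetical sketch of what such settings could look like, not the file's actual schema.

  ```yaml
  # Hypothetical sketch -- key names and values are illustrative,
  # not the actual schema of db_config.yaml
  sstable_size_mb: 64    # SSTable size
  memtable_size_mb: 64   # MemTable size
  ```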
These options provide flexibility to explore additional workloads or system behaviors beyond the default setup.