Big Data Benchmark

Based on the Berkeley Big Data Benchmark, this repository provides scripts that make it easy to:

deploy an up-to-date HDP cluster on EC2
copy data for the Intel Hadoop Benchmark and TPC-H from S3
convert the data sets to Parquet, ORC and RCFile
run and time Intel Hadoop Benchmark queries and a subset of TPC-H queries
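
The convert and run-and-time steps above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual runner: the helper names `conversion_statement` and `time_query`, the `<table>_<format>` naming scheme, and the example table name `rankings` are all assumptions made for this example. The `STORED AS` clauses are standard HiveQL.

```python
import statistics
import time
from typing import Callable, List

# HiveQL storage clauses for the three target formats listed above.
STORAGE_CLAUSES = {
    "parquet": "STORED AS PARQUET",
    "orc": "STORED AS ORC",
    "rcfile": "STORED AS RCFILE",
}


def conversion_statement(source_table: str, fmt: str) -> str:
    """Build a CREATE TABLE AS SELECT statement that copies a table into `fmt`.

    The `<table>_<format>` target name is a hypothetical convention.
    """
    key = fmt.lower()
    clause = STORAGE_CLAUSES[key]
    target = f"{source_table}_{key}"
    return f"CREATE TABLE {target} {clause} AS SELECT * FROM {source_table}"


def time_query(run: Callable[[], None], repeats: int = 3) -> List[float]:
    """Call `run` `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    print(conversion_statement("rankings", "orc"))
    # A sleep stands in for submitting a real query to an engine.
    timings = time_query(lambda: time.sleep(0.01))
    print(f"median: {statistics.median(timings):.3f}s")
```

Reporting the median over several runs is one common way to reduce the effect of cold caches and transient cluster noise. Whether this framework does so is not stated here.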

The framework aims to support:

Hive-on-Tez
Shark
Presto
Impala

Engine support is currently in development.
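
One way a runner might dispatch the same query to each of the engines above is to map every engine to its command-line client. The sketch below is an assumption about how that could look, not a description of this repository's code. The binary names and flags reflect the usual CLI front ends (`hive -e`, `shark -e`, `presto --execute`, `impala-shell -q`) and may differ by version or installation; `build_command` is a hypothetical helper.

```python
from typing import Dict, List

# Assumed CLI front ends; adjust binary names and flags to the installed versions.
ENGINE_CLI: Dict[str, List[str]] = {
    "hive-on-tez": ["hive", "--hiveconf", "hive.execution.engine=tez", "-e"],
    "shark": ["shark", "-e"],
    "presto": ["presto", "--execute"],
    "impala": ["impala-shell", "-q"],
}


def build_command(engine: str, query: str) -> List[str]:
    """Return the argv list that would submit `query` to `engine`."""
    try:
        prefix = ENGINE_CLI[engine.lower()]
    except KeyError:
        raise ValueError(f"unsupported engine: {engine!r}") from None
    return prefix + [query]


if __name__ == "__main__":
    for name in ENGINE_CLI:
        print(build_command(name, "SELECT COUNT(*) FROM rankings"))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Keeping the query as a single argv element avoids shell quoting problems.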

About

Large scale query engine benchmark

amplab.cs.berkeley.edu/benchmark/
