Spark job to run PDGF in parallel on a Spark cluster
Official repository of Trino, the distributed SQL query engine for big data, formerly known as Pr...
Apache Doris is an MPP-based interactive SQL data warehouse for reporting and analysis.
All the things about TPC-DS in Apache Spark
A non-validating SQL parser module for Python
Sampling CPU and heap profiler for Java featuring AsyncGetCallTrace + perf_events
Includes notes on Apache Spark, Spark for Physics, Jupyter notebook examples for Spark and Oracle.