Presto
Presto is a distributed SQL query engine for big data. It lets users query a variety of data sources, such as Hadoop, relational databases, and MongoDB. A single query can even combine data from multiple data sources.
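
To make the multi-source point concrete, here is a minimal sketch of a query that joins a table in Hadoop with a table in a relational database. Presto addresses tables as `catalog.schema.table`; the catalog, schema, table, and column names below (`hive.sales.orders`, `mysql.crm.users`, and so on) are hypothetical and depend on how a cluster is configured. The `run_query` helper is also an illustration: it assumes any Python DB-API 2.0 cursor already connected to a Presto coordinator.

```python
# Hypothetical catalog, schema, table, and column names, for illustration only.
FEDERATED_QUERY = """
SELECT u.name, count(*) AS order_count
FROM hive.sales.orders AS o      -- table stored in Hadoop, exposed by a Hive catalog
JOIN mysql.crm.users AS u        -- table stored in a relational database
  ON o.user_id = u.id
GROUP BY u.name
ORDER BY order_count DESC
LIMIT 10
"""


def run_query(cursor, sql=FEDERATED_QUERY):
    """Run the query on a DB-API 2.0 cursor (assumed to point at Presto)."""
    cursor.execute(sql)
    return cursor.fetchall()
```

The join itself is ordinary SQL; only the catalog prefix on each table name tells the engine which data source to read from.
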
Hive
Hive is data warehouse software built on top of Apache Hadoop for data query and analysis. It translates SQL queries into multiple stages of MapReduce jobs, and it is powerful enough to handle huge numbers of jobs.
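
As a rough illustration of what "stages of MapReduce" means, the sketch below simulates in plain Python how an aggregation such as `SELECT country, SUM(amount) FROM sales GROUP BY country` could be split into map, shuffle, and reduce steps. This is a conceptual toy with made-up data, not Hive's actual planner or execution code.

```python
from collections import defaultdict

# Toy rows standing in for a table with columns (country, amount).
ROWS = [("US", 10), ("FR", 5), ("US", 7), ("DE", 3), ("FR", 1)]


def map_phase(rows):
    """Map: emit one (key, value) pair per input row."""
    for country, amount in rows:
        yield country, amount


def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: aggregate each key's values into one output row."""
    return {key: sum(values) for key, values in groups.items()}


totals = reduce_phase(shuffle_phase(map_phase(ROWS)))
print(totals)  # {'US': 17, 'FR': 6, 'DE': 3}
```

A more complex query (a join followed by an aggregation, for example) would chain several such stages, each one writing its output before the next one starts.
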
Comparisons
Feature | Presto | Hive |
---|---|---|
best suited for | Interactive queries | Large data aggregations (joins, for example) |
SQL standard compliance | ANSI SQL | HiveQL |
window function | yes | yes |
optimization target | latency | query throughput |
execution mode | Keeps intermediate data in memory | Spills intermediate data to the file system |
fault tolerance | No | Yes |
speed | about 2 times faster than Hive | slower |
data processing models | push model | pull model |
limitation | limited by the maximum amount of memory that each task can use | |
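
The "push model" versus "pull model" row can be illustrated with a generic sketch. In a pull pipeline the consumer asks the upstream operator for the next row; in a push pipeline the producer hands each row to the downstream operator as soon as it is available. The code below is a toy example of the two styles and is not taken from either engine's source; all names in it are hypothetical.

```python
def pull_scan(rows):
    """Pull model: a row is produced only when the consumer asks for it."""
    for row in rows:
        yield row


def pull_filter(source, predicate):
    """Pull model: the filter requests rows from its upstream source."""
    for row in source:
        if predicate(row):
            yield row


class PushFilter:
    """Push model: the producer hands each row to this operator."""

    def __init__(self, predicate, downstream):
        self.predicate = predicate
        self.downstream = downstream  # callable that receives matching rows

    def push(self, row):
        if self.predicate(row):
            self.downstream(row)


def push_scan(rows, operator):
    """Push model: the scan drives the pipeline by pushing rows downstream."""
    for row in rows:
        operator.push(row)


def is_even(n):
    return n % 2 == 0


data = [1, 2, 3, 4, 5, 6]

# Pull: the final consumer (list) drives the pipeline.
pulled = list(pull_filter(pull_scan(data), is_even))

# Push: the scan drives the pipeline and rows land in a sink.
pushed = []
push_scan(data, PushFilter(is_even, pushed.append))
```

Both pipelines produce the same rows; the difference is which side controls the flow of data.
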