PrestoCon 2020 has ended


Presentations
Thursday, September 24
 

10:05am PDT

Presto at Uber - Harsha Reddy, Uber
This talk will cover:
- Overall Presto ecosystem at Uber
- Micro-service ecosystem supporting Presto-Proxy, Automation framework and others
- Presto for ETL
- Ongoing projects and Roadmap

Speakers

Harsha Reddy

Senior Software Engineer, Uber
Harsha Reddy is a Senior Software Engineer on Uber's Data - Interactive Analytics Engineering team. His work is focused on building and enhancing the ecosystem for Data Analytics platforms to make them more adaptive and scalable to ever-growing data needs. He is primarily involved...


Thursday September 24, 2020 10:05am - 10:35am PDT

10:35am PDT

Data Lake Analytics -- Alibaba's Presto Experiences - James Xu, Alibaba
Data Lake Analytics (DLA) is a large-scale serverless data federation service on Alibaba Cloud. One of its most popular serverless SQL engines is based on a heavily customized PrestoDB. DLA integrates with mainstream data sources and provides an easy-to-use MySQL protocol for users to interact with. It also adds key features such as multi-tenancy and a one-click data warehouse. In this talk, we will introduce the system architecture of the DLA SQL engine, as well as some best-practice scenarios. We will also cover some key features we developed and our future plans.

Speakers

James Xu

Developer, Alibaba
Software developer on the Alibaba Data Lake Analytics product.


Thursday September 24, 2020 10:35am - 11:05am PDT

11:05am PDT

Panel Discussion: The Presto Ecosystem - Moderated by Dipti Borkar, Ahana; Nezih Yigitbasi, Facebook; Maxime Beauchemin, Preset; Vinoth Chandar, Apache Hudi & Kishore Gopalakrishna, LinkedIn
In this panel, leaders from the open source data community will talk about the disaggregated analytics stack with Apache Superset, Apache Hudi, Apache Pinot & Presto, the problems that each of these important projects solve and how they are better together.

Speakers

Kishore Gopalakrishna

PMC, Apache Pinot
Kishore Gopalakrishna is a founding engineer at a stealth mode startup. Prior to that, he was the architect at LinkedIn’s analytics infra team. Kishore is passionate about solving hard problems in distributed systems. He has authored various distributed systems such as Apache Helix...

Nezih Yigitbasi

Engineering Manager, Facebook
Nezih Yigitbasi is an engineering manager on the Presto team at Facebook and currently serves as the chair of the technical steering committee of the Presto Foundation. He is an open source enthusiast and has contributed to multiple open source big data projects, including Presto...

Dipti Borkar

Cofounder & Chief Product Officer, Ahana
Dipti Borkar is the Cofounder, Chief Product Officer & Chief Evangelist at Ahana, the Presto company. She is responsible for all things strategy, product and community. She is also the Chairperson of the Presto Foundation, Outreach team. She has over 15 years of experience in data...

Maxime Beauchemin

CEO & Founder, Preset
Maxime is the original creator of Apache Superset and Apache Airflow.

Vinoth Chandar

Vice President, Apache Hudi
Vinoth Chandar is the original creator & VP of the Apache Hudi project, which has changed the face of data lake architectures over the past few years. Vinoth has a keen interest in unified data storage/processing architectures. Currently, Vinoth Chandar drives various efforts around...


Thursday September 24, 2020 11:05am - 11:35am PDT

12:35pm PDT

Optimizing Query Performance by Decoupling Presto and Hive Data Warehouse - Gene Pang & Calvin Jia, Alluxio, Inc.
Presto is commonly used to query existing Hive data warehouses. Due to existing applications, tech debt, or previous operational challenges, Presto may not be able to achieve its full potential and is instead bound and limited by past decisions. Challenges include an overloaded Hive Metastore, unoptimized data layouts such as too many small files, and a lack of influence over existing Hive applications.

Ideally, Presto would access data independently of how the data was originally managed. Alluxio, as a data orchestration layer, provides the physical data independence that lets Presto interact with the data more efficiently. In addition to caching, Alluxio provides a catalog service to abstract the table metadata, and transformations to expose compute-optimized data. In this talk, Gene describes the challenges of using Presto with Hive, and discusses how Alluxio data orchestration can solve them.

Speakers

Calvin Jia

Software Engineer, Alluxio
Calvin Jia is the top contributor of the Alluxio project. He has been involved as a core maintainer and release manager since the early days when the project was known as Tachyon. Calvin has a B.S. from the University of California, Berkeley.

Gene Pang

Head Architect, Alluxio, Inc.
Gene Pang is the PMC Maintainer of the Alluxio open source project and a founding member of Alluxio, Inc. He graduated with a Ph.D. from the AMPLab at UC Berkeley, working on distributed database systems. Before starting at Berkeley, he worked at Google and has an M.S. from Stanford...


Thursday September 24, 2020 12:35pm - 1:05pm PDT

1:05pm PDT

Presto at Twitter - Beinan Wang & Chunxu Tang, Twitter
At Twitter, engineers maintain 9 on-prem/on-cloud Presto clusters with over 3000 nodes. Drawing on the experience of developing and maintaining these clusters, Beinan and Chunxu share the stories of team efforts to develop the Presto-Druid connector, empowering SQL queries for real-time data analytics; the Presto router service, consolidating Presto clusters to establish a federation system; and the query predictor service, introducing machine learning techniques to the Presto ecosystem.

Speakers

Chunxu Tang

Software Engineer, Twitter
Chunxu is a software engineer in Twitter's Interactive Query team where he works on developing and maintaining Presto and Druid services. He received his doctoral degree from Syracuse University, where he did research on machine learning and distributed collaboration systems.

Beinan Wang, Ph.D.

Sr. Software Engineer, Twitter
Beinan builds large-scale distributed SQL systems (Presto and Hive) for Twitter's data platform team.


Thursday September 24, 2020 1:05pm - 1:35pm PDT

1:35pm PDT

Presto at Pinterest - Ashish Kumar Singh, Pinterest
As a data-driven company, Pinterest makes many critical business decisions based on insights from data. Presto has played a key role in enabling interactive querying at Pinterest. Operating Presto at Pinterest’s scale has involved resolving quite a few challenges. In this talk, Ashish Singh will share Pinterest's journey of adopting, using and enhancing Presto to meet Pinterest's interactive querying needs.

Speakers

Ashish Kumar Singh

Tech Lead, Bigdata Query Processing Platform, Pinterest
Ashish Singh is a tech lead on the BigData Query Processing Platform team and an Apache committer. He focuses on making it easier, faster, and more reliable to express computational needs with SQL. Before joining Pinterest three years ago, Ashish worked on various big data projects, primarily...


Thursday September 24, 2020 1:35pm - 2:05pm PDT

2:05pm PDT

How Agari Replaced their Custom ETL and Data Analytics Pipelines with Serverless Presto using Amazon Athena - Jonathan Chase, Agari & Roy Hasson, Amazon
Amazon Athena is a fully managed serverless service for querying and analyzing data in Amazon S3 using the power of Presto. In this session you will be introduced to Athena and learn how Agari, an email threat prevention and protection service, built a data processing and analytics platform leveraging the capabilities of Presto and the ease of use of Amazon Athena to improve threat detection.

Speakers

Roy Hasson

Sr. Manager, Business Development, Amazon
Roy Hasson is the WW Analytics Specialist leader at Amazon Web Services, where he helps transform organizations using data, analytics and machine learning. Roy serves as an expert advisor to customers across all industries to transform their business and become a data driven organization...

Jonathan Chase

Director of Engineering, Agari
Jon is a Director of Engineering at Agari. He loves all things horizontally scalable and distributed and has spent the last 20 years working on various aspects of distributed systems, security, and big data. Jon is the author of the Apache Camel AWS Athena component.



Thursday September 24, 2020 2:05pm - 2:35pm PDT
  Presentations
  • Session Slides Included Yes

3:05pm PDT

Panel Discussion - Presto, Today, and Beyond - Moderated by Dipti Borkar, Ahana; David Simmen, Ahana; Biswapesh Chattopadhyay, Facebook & Girish Baliga, Uber
Today Presto is widely adopted for many use cases at Facebook, Uber and across the community. In this panel, learn more about the future of Presto with Biswapesh Chattopadhyay, Girish Baliga, David Simmen and Dipti Borkar. They will discuss the exciting innovations being planned and worked on and the shared vision for Presto.

Speakers

Dipti Borkar

Cofounder & Chief Product Officer, Ahana
Dipti Borkar is the Cofounder, Chief Product Officer & Chief Evangelist at Ahana, the Presto company. She is responsible for all things strategy, product and community. She is also the Chairperson of the Presto Foundation, Outreach team. She has over 15 years of experience in data...

David Simmen

CTO & Co-Founder, Ahana
As Cofounder and CTO, David oversees Ahana’s technology strategy and drives product innovation. David joined Ahana most recently from Apple where he engineered iCloud database services. Prior to Apple, he was Chief Architect with Splunk and named the first Fellow in the company’s...

Girish Baliga

Engineering Manager, Uber
Girish manages Presto, Pinot, and Flink teams at Uber. Before that, he spent over a decade optimizing resources, search ads, and geo data analytics at Google, interrupted by a brief start-up stint at Urban Engines. He has a PhD in Computer Science and an MS in Math from UIUC.

Biswapesh Chattopadhyay

Tech Lead, DI Compute, Facebook
Biswapesh currently leads the Data Infrastructure Compute Engines (Presto, Spark, Streaming, etc.) and Modernization efforts at Facebook. He is a deep domain expert on Big Data with over two decades of experience in exascale data infrastructure and distributed systems. He is the...


Thursday September 24, 2020 3:05pm - 3:35pm PDT

3:35pm PDT

The Practice of Presto & Alluxio in E-Commerce Big Data Platform - Wenjun Tao, JD.COM
JD.com is one of the largest e-commerce companies. JD.com's big data platform has tens of thousands of nodes and tens of petabytes of offline data that require millions of Spark and MapReduce jobs to process every day. As the main query engine, Presto runs on thousands of machines and plays an important role in in-place analysis and BI tools. Meanwhile, Alluxio is deployed to improve the performance of Presto. The practice of Presto & Alluxio at JD.com benefits many engineers and analysts.

Speakers

Wenjun Tao

Big Data Platform Engineer, JD.COM
Wenjun Tao is a Senior Software Engineer at JD.com and a core member of the Presto team. Graduated from the School of Computer Science, Beijing Institute of Technology. The main research interest during the master's degree was distributed and parallel systems.



Thursday September 24, 2020 3:35pm - 4:05pm PDT
  Presentations
  • Session Slides Included Yes

4:05pm PDT

Presto at Facebook: Today and Tomorrow - Biswapesh Chattopadhyay, Facebook
The query engine landscape at Facebook, current and upcoming challenges in data, and how we are thinking about evolving Presto over the next few years to take it to the next level.

Speakers

Biswapesh Chattopadhyay

Tech Lead, DI Compute, Facebook
Biswapesh currently leads the Data Infrastructure Compute Engines (Presto, Spark, Streaming, etc.) and Modernization efforts at Facebook. He is a deep domain expert on Big Data with over two decades of experience in exascale data infrastructure and distributed systems. He is the...



Thursday September 24, 2020 4:05pm - 4:35pm PDT
  Presentations
  • Session Slides Included Yes
 