Data Processing Principal Engineer

DataPelago

Posted on Apr 2, 2025

Lead the architecture, design, and implementation of a next-generation data processing engine built to exploit accelerated computing.

The developer will have the following specific responsibilities in achieving these objectives.

Responsibilities

Work with cross-functional teams to advance the architecture at the core of DataPelago’s parallel and distributed execution engine based on accelerated computing

Lead the execution engine team in the design, implementation, and rollout of an enterprise-grade, highly reliable acceleration engine for data processing

Design, implement, test, and maintain major components of the execution engine

Analyze and identify areas for differentiation and improvements in the execution engine

Collaborate with other teams and team members to drive code reviews, design reviews, performance and reliability reviews, and development process and to drive continuous improvement in all of these

Qualifications

B.S. in EE/CS or equivalent with 15+ years of experience, or M.S. with 10+ years of experience

10+ years of experience developing core components of an enterprise-grade database or analytics execution engine serving large-scale data processing workloads

Experience developing for platforms such as Apache Spark, Gluten, Velox, or DataFusion preferred

Experience developing high-performance parallel implementations of data processing operators and functions, such as joins, aggregations, and sorts

Experience leading 10+ teams in designing, developing, and releasing high-performance data processing engines for large production deployments

Strong programming ability in C, C++, and Rust

Strong development experience on Linux platforms