Data Processing Software Developer

DataPelago

Software Engineering
Mountain View, CA, USA
Posted on Oct 2, 2024

Implement software that performs data processing operations efficiently on large data sets as part of an overall data processing engine. Work closely with technical leads and other developers to build this software and ensure that data is processed accurately, efficiently, and reliably within the overall analytics engine. Investigate and address issues in the software and develop necessary enhancements on an ongoing basis.

The developer will have the following specific responsibilities in achieving these objectives.

Responsibilities

  • Working with cross-functional teams to deliver new algorithms and techniques in the core DataPelago GPU-powered, parallel, and distributed execution engine
  • Developing enterprise-grade, highly reliable code and owning the end-to-end delivery of solutions ensuring smooth production rollouts
  • Designing, implementing, testing, and maintaining enhancements in different components of our engine. Developing highly modular components and ensuring clean integration with the entire stack
  • Defining algorithms, mechanisms, procedures, and policies for the required functionality, documenting them, and securing review and approval from the technical lead(s)
  • Continuously analyzing and identifying areas for differentiation and improvements in the execution engine
  • Identifying bottlenecks in scaling and performance and implementing solutions
  • Investigating and resolving all bugs before and after the release of the software
  • Supporting the troubleshooting and resolution of issues, incidents, and problems, working in collaboration with others as needed. Documenting findings and resolutions, and identifying the enhancements required to avoid recurrence
  • Collaborating with other teams and team members and assisting with code reviews, design reviews, and development process improvements

Qualifications

  • BS in EE/CS or equivalent with 5+ years of experience, or MS with 3+ years of experience
  • 5+ years of experience developing database execution engine components, including core database operations (joins, aggregations, sorts, and analytic functions such as window functions), in an enterprise-class database serving large-scale data processing workloads
  • Experience developing modern analytic databases and familiarity with vectorized execution, parallel and distributed database algorithms, memory management, etc.
  • Good understanding of SQL and familiarity with Lakehouse architectures and technologies
  • Solid experience developing performant code that minimizes overhead and latency and maximizes throughput
  • Experience with best practices for developing high-quality, reliable software
  • Experience developing, evaluating, and troubleshooting high-performance software
  • Strong programming ability in C and C++. Rust experience is a strong plus
  • Strong development experience on Linux platforms