Dataproc Lead, Spark, OSS Technologies, Google Cloud
Minimum qualifications:
- Bachelor’s degree or equivalent practical experience.
- 5 years of experience with software development in one or more programming languages, and with data structures/algorithms.
- Experience in software development and engineering, incorporating design methodologies, leveraging open source technologies, and working with distributed computing systems, including Apache Spark, Apache Hadoop, and Apache Hive.
- Experience in Open Source technologies, Big Data, Data Analytics, Artificial Intelligence, Machine Learning, and Database Internals.
Preferred qualifications:
- Experience with database optimizations such as query and executor optimizations.
- Experience with data lake table formats like Apache Iceberg, Apache Hudi, Delta Lake, etc.
- Experience with OpenTelemetry, JMX, and other monitoring solutions.
- Experience with OSS projects like Spark, Hive, Trino, Ray, Flink, etc.
- Experience working with data science tools such as Jupyter notebooks.
- Experience developing Cloud or SaaS products.
About the job
Google Cloud's software engineers develop the next-generation technologies that change how billions of users connect, explore, and interact with information and one another. We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design, and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google Cloud's needs, with opportunities to switch teams and projects as you and our fast-paced business grow and evolve. You will anticipate our customers' needs and be empowered to act like an owner, take action, and innovate. We need our engineers to be versatile, display leadership qualities, and be enthusiastic to take on new problems across the full stack as we continue to push technology forward.
Cloud Dataproc enables open source data analytics users (Apache Hadoop, Spark, Trino, Flink, etc.) to lift and modernize their workloads into the cloud. Dataproc is a fast, easy-to-use, fully managed cloud service for running Apache Spark, Apache Hadoop, and dozens of other OSS tools in a simpler, more performant, and more cost-efficient way. Dataproc also integrates easily with other Google Cloud Platform (GCP) services such as BigQuery, Dataplex (governance, lineage), and Catalog Stores to provide a powerful and complete platform for data processing, analytics, and machine learning.
Google Cloud accelerates every organization’s ability to digitally transform its business and industry. We deliver enterprise-grade solutions that leverage Google’s cutting-edge technology, and tools that help developers build more sustainably. Customers in more than 200 countries and territories turn to Google Cloud as their trusted partner to enable growth and solve their most critical business problems.
Responsibilities
- Build high-impact customer-facing features which make Cloud Dataproc the best place to run Spark, Ray, Trino, Flink and newer technologies in the cloud.
- Define the roadmap for Open Source technologies like Spark, Ray, Trino, Flink, etc.
- Define and implement the next-generation Data Lakes and Lakehouses, focusing on technologies like Iceberg, Hudi, and Delta.
- Optimize the open source technologies for performance and efficiency.
- Design and build a software stack that takes advantage of Google technologies for faster cluster setup, efficient cluster operations, and comprehensive monitoring and observability.
Information collected and processed as part of your Google Careers profile, and any job applications you choose to submit, is subject to Google's Applicant and Candidate Privacy Policy.
Google is proud to be an equal opportunity and affirmative action employer. We are committed to building a workforce that is representative of the users we serve, creating a culture of belonging, and providing an equal employment opportunity regardless of race, creed, color, religion, gender, sexual orientation, gender identity/expression, national origin, disability, age, genetic information, veteran status, marital status, pregnancy or related condition (including breastfeeding), expecting or parents-to-be, criminal histories consistent with legal requirements, or any other basis protected by law. See also Google's EEO Policy, Know your rights: workplace discrimination is illegal, Belonging at Google, and How we hire.
If you have a need that requires accommodation, please let us know by completing our Accommodations for Applicants form.
Google is a global company and, in order to facilitate efficient collaboration and communication globally, English proficiency is a requirement for all roles unless stated otherwise in the job posting.
To all recruitment agencies: Google does not accept agency resumes. Please do not forward resumes to our jobs alias, Google employees, or any other organization location. Google is not responsible for any fees related to unsolicited resumes.