Senior Cloud Data Engineer

Job description


GPR is radically accelerating the arrival of self-driving vehicles by tackling some of the most challenging problems that stand in the way of safe and reliable navigation.

Every road in the world has a unique subsurface signature. GPR uses radar to create a map of those subsurface signatures from which self-driving cars can navigate. Vehicles using GPR are unaffected by common but challenging road conditions like snow, heavy rain, fog, or poor lane markings.

GPR is working with leading autonomous vehicle and traditional automotive companies, is backed by leading investors, is growing quickly, and is building a talented team that wants to transform the future of mobility and work on some of the hardest and most important engineering problems around. If that sounds like you, please drop us a line.


As a Senior Cloud Data Engineer, you will work with our cloud architect to shape and implement GPR's cloud infrastructure for our radar-based map and sensor fusion-based localization algorithms. Your work will involve developing and implementing the cloud infrastructure for large-scale mapping for autonomous vehicle localization, with a focus on radar maps. You will focus on building a scalable system that will interact with fleets of vehicles. We employ software-based approaches to solve complex infrastructure challenges and automate those solutions. We have a strong focus on using an engineering and software development approach to manage and scale our cloud infrastructure.


  • Design, develop, deploy, and document the data infrastructure, which includes a distributed big data platform and a data lake on top of a cloud storage system (S3)

  • Design and maintain metadata systems, data catalogs, data governance, data search and discovery and related services

  • Develop and maintain data platform solutions in accordance with best practices

  • Develop reliable data pipelines

  • Troubleshoot production systems, test them for security, performance, and availability, and resolve the issues found

  • Assist developers in generating new data-based solutions relevant to the problems they are tackling

  • Train and educate team members on the implementation of new data platforms and technologies

  • Own and extend the GPR HD-MAP data pipeline through the collection, storage, processing, and transmission of GPR datasets

  • You’re comfortable thinking about the big picture and the small details. You enjoy building strong designs

  • You enjoy working with small, high-output teams in a fast-paced startup environment.

  • A “get-it-done” person. You know that done is better than perfect and are energized by constantly delivering and moving things forward.


  • Experience building and maintaining reliable, scalable ETL on big data platforms, as well as experience working with varied forms of data infrastructure: SQL/NoSQL databases, data warehouses, data lakes, Spark, columnar data storage, etc.

  • Experience working closely with data analysts and data scientists, gathering technical requirements, and ensuring that the collected data is of high quality and optimal for use

  • Demonstrated ability to understand data sources, participate in design, and provide insights and guidance on database technology, data modeling, and DataOps best practices

  • Bachelor's or Master's degree in Computer Science or a comparable engineering field

  • Programming experience in Python/Scala/Java

  • Solid written and verbal communication skills

  • If you are planning to work remotely, you must be a strong and talented remote worker and a solid team player. You will need to spend 2-3 weeks on site up front to meet the team and learn the technology. You will need to be on board with traveling to work engagements and sites where needed. You should plan to be in the office in person at least 1 week every 2 months, and additionally as work and team needs require.


  • Experience with open source metadata management and data catalog systems such as Apache Atlas is highly desirable

  • Experience building and maintaining data lakes with open source Apache Hudi

  • Experience in deploying and maintaining Spark/Flink on k8s

  • Demonstrated sophisticated troubleshooting and performance tuning capabilities for Apache Spark-based big data platforms

  • Experience with Apache Kafka, Apache Pulsar, or similar technologies

  • Knowledge of advanced data caching/orchestration with Alluxio, Apache Ignite, etc.

  • Experience with real-time streaming pipelines using Flink, Kafka, etc.


  • Must be currently eligible to work in the US. Please indicate if you need or will eventually need sponsorship on your application.
