datazeit GmbH

Developer - Web Crawling and Data Engineering

Job description

about us.
We enable AI-powered product creation by making fact-based decisions and leveraging the value of consumer and product data - because we believe that companies can only succeed in today’s world by being pioneers at every stage of the product life cycle. We don’t shy away from new technologies, but embrace and create them. To help our customers be ahead of the industry, we are developing not only the largest, but also the most detailed knowledge graph for products in the world, building a unique and massive amount of data that we crunch to create meaningful information. By matching data from various web sources and social networks we identify early signals, microtrends and hidden seeds before they skyrocket.

about you.

We are looking for a talented, passionate, and pragmatic engineer, able to work in a rapidly changing environment. You will work on large-scale web crawlers to extract structured/unstructured data from multiple online sources. If you love playing with data and code, have an eye for detail, and a strong client delivery mindset, we’d love to hear from you!

what we offer.

  • A competitive salary.
  • Responsibility from day one in a fast-growing, global startup.
  • A vibrant international team.
  • Flexible working hours and the possibility to work remotely some days a week.

key responsibilities.

  • Develop a deep understanding of our vast data sources on the web and know exactly how, when and which data to scrape, parse and store.
  • Develop frameworks for automating and maintaining a constant flow of data from multiple sources.
  • Look to solve problems with an optimistic approach and take on challenges.
  • Implement best practices and patterns in design, development, and deployment.
  • Create, use and maintain web scraping libraries so that future scrapers can be built faster.
  • Work independently with little supervision to research and test innovative solutions.

skills required.

Core skills and experience we are looking for:
  • 2-6 years of experience in Node.js and Python, with frameworks like Puppeteer, Selenium and others.
  • Experience with data parsing - knowledge of regular expressions, HTML, CSS, DOM, JavaScript and XPath.
  • Experience with network protocol analyzers like Wireshark and Charles Proxy.
  • Experience with storing structured & unstructured data in Postgres and AWS S3.
  • Strong knowledge of performance optimization, multiprocessing, multithreading and concurrency.
  • Experience with AWS cloud services.
  • Good knowledge in web scraping and APIs.

“That’s me” you’re thinking? “That’s what I’m looking for”? Then what are you waiting for? Contact us!
