Data Scientist 4

Location: Bengaluru, Karnataka

*** Mention DataYoshi when applying ***

Company Description

At Western Digital, our vision is to power global innovation and push the boundaries of technology to make what you thought was once impossible, possible.

At our core, Western Digital is a company of problem solvers. People achieve extraordinary things given the right technology. For decades, we’ve been doing just that. Our technology helped people put a man on the moon.

We are a key partner to some of the largest and highest growth organizations in the world. From energizing the most competitive gaming platforms, to enabling systems to make cities safer and cars smarter and more connected, to powering the data centers behind many of the world’s biggest companies and public cloud, Western Digital is fueling a brighter, smarter future.

Binge-watch any shows, use social media or shop online lately? You’ll find Western Digital supporting the storage infrastructure behind many of these platforms. And, that flash memory card that captures and preserves your most precious moments? That’s us, too.

We offer an expansive portfolio of technologies, storage devices and platforms for businesses and consumers alike. Our data-centric solutions comprise the Western Digital®, G-Technology™, SanDisk® and WD® brands.

Today’s exceptional challenges require your unique skills. It’s You & Western Digital. Together, we’re the next BIG thing in data.

Job Description

Role Description

The Data Engineer is responsible for supporting the SnapLogic ETL, AWS Redshift, RDS, and Aurora environments, exploring performance improvement opportunities, and troubleshooting data ingestion and user query issues. This individual will perform typical database administration tasks such as managing user accounts, monitoring system capacity usage and performance, balancing system workload, identifying the root cause of issues raised by users and proposing solutions, as well as supporting and enhancing data ingestion processes developed by other development teams. This individual will work with business users to identify performance improvement opportunities and implement them in order of urgency and priority. In addition, this individual will identify data integration, ETL, and data quality issues, investigate issues identified by others, research and implement solutions to correct problems (at the source where possible), collaborate with developers to correct historical data, develop reports and monitoring programs, and execute data integration initiatives.

Job Responsibilities:

  • Manage user accounts and security in the Redshift, RDS, and Aurora environments
  • Monitor and manage system workloads to keep all systems running smoothly and minimize bottlenecks
  • Configure system connections between data sources and targets
  • Assist users in debugging query failures and data quality problems
  • Analyze objects such as tables and views designed and deployed by development teams and users, and propose best-practice solutions
  • Analyze user queries, identify performance improvement opportunities, and help users improve query performance
  • Analyze and manage disk space usage by users, and provide best practices to development teams and business users
  • Maintain, modify, and enhance data ingestion processes at the request of development teams
  • Develop and deploy data load processes at the request of business users, including the design and development of tables, views, and ETL data load pipelines
  • Create native connections between Redshift, RDS, and Aurora
  • Develop and maintain metrics programs to monitor system usage and performance
  • Manage Redshift data archiving processes
  • Develop Redshift Spectrum objects
  • Support the Engineering and Manufacturing Data Warehouse and related data systems and initiatives
  • Solve moderate to complex data quality and data integration problems
  • Recognize and investigate anomalies and propose solutions
  • Research and recommend innovative, and where possible automated, approaches to data issues
  • Monitor ETL pipelines and troubleshoot and resolve any issues
  • Liaise with business users and subject matter experts to find permanent solutions to recurring issues
  • Work closely with managed service providers to streamline production support processes

Required Qualifications:

  • Bachelor’s degree in Computer Science, Information Systems, or a related field, or a minimum of seven years of relevant professional experience and training
  • 7 to 9 years of experience in Data Engineer or Support Engineer roles
  • 3+ years of AWS Redshift experience
  • 2+ years of experience in performance tuning and optimization of SQL databases
  • 3+ years of experience supporting packaged ETL software such as Informatica or SnapLogic
  • Familiarity with, and experience in, performance tuning approaches for MPP databases
  • Proficiency in SQL
  • Experience in data modeling, including logical and physical design, normalization, and denormalization
  • Expertise in performance tuning in one or more RDBMSs such as Postgres, MySQL, Oracle, DB2, or MS SQL
  • Experience with programming tools such as Linux shell scripting, Python, or Java
  • Experience in the analysis, design, development, and implementation of ETL; exposure to commercial ETL tools (SnapLogic, Informatica, Alteryx, Matillion) and reporting tools (Tableau, Spotfire) is an advantage but not mandatory
  • Ability to work effectively and efficiently in an enterprise environment involving very large data sets (millions of records in both databases and flat files)
  • Experience in requirements analysis, design, and development in data management application areas
  • Ability to work with cross-functional and cross-organizational teams to understand issues, implement solutions, and influence others to implement solutions
  • Comfort in a fast-paced environment with constantly shifting priorities and a mix of quick-hit, short-, medium-, and long-term projects of varying priorities
  • High-tech or manufacturing industry experience preferred but not required
  • Excellent communication, facilitation, and interpersonal skills required

Additional Information

Western Digital thrives on the power and potential of diversity. As a global company, we believe the most effective way to embrace the diversity of our customers and communities is to mirror it from within. We believe the fusion of various perspectives results in the best outcomes for our employees, our company, our customers, and the world around us. We are committed to an inclusive environment where every individual can thrive through a sense of belonging, respect and contribution.

Western Digital is committed to offering opportunities to applicants with disabilities and ensuring all candidates can successfully navigate our careers website and our hiring process. Please contact us at to advise us of your accommodation request. In your email, please include a description of the specific accommodation you are requesting as well as the job title and requisition number of the position for which you are applying.
