Position: Data Scientist/Data Engineer (Independent Contractors Only)
Data Scientist/Data Engineer/Data Analyst
Remote
Contract
[PySpark, Oracle/SQL, Linux, Hadoop Cloudera, IBM Spectrum Conductor, IBM Spectrum Scale]
This role is a mix of Data Analyst and Data Engineer responsibilities
- They will be responsible for working on the bank’s regulatory data processes within its applications
- These applications capture the data and are used for earnings, reporting, auditing, etc. This cycle runs every quarter
The applications layer is built predominantly in Python
- Python is used for data processing and time-series work (Pandas will be used)
- This is not research-type work; it is done at production scale
- Previous experience working with Python and/or R and SAS in this production capacity is required
- They use typical big data technologies such as Hadoop, and work in a big data environment with, for example, file sets containing billions of rows
- There is existing SAS code that is sent to Python and consumed through Spark
- The next step is to introduce converting that code into PySpark
Much of the work is plumbing: getting data from one side to another
- They will be working in different cloud environments, so software engineering experience is beneficial
Data worked with is confidential, classified nonpublic information
- Previous experience working with this kind of data is always preferred
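As an illustration of the quarterly, Pandas-based processing described above, here is a minimal sketch of aggregating records into quarterly totals. The column names (`date`, `exposure`) are hypothetical, not taken from the posting, and this assumes Pandas is available:

```python
import pandas as pd

def quarterly_totals(df: pd.DataFrame) -> pd.DataFrame:
    """Aggregate dated records into per-quarter totals.

    Assumes a datetime column ``date`` and a numeric column
    ``exposure`` (both illustrative names, not from the posting).
    """
    out = df.copy()
    # Map each record's date to its calendar quarter (e.g. 2024Q1)
    out["quarter"] = out["date"].dt.to_period("Q")
    # Sum exposures within each quarter for the reporting cycle
    return out.groupby("quarter", as_index=False)["exposure"].sum()
```

In a production pipeline of the kind described, the same grouping logic would typically be re-expressed in PySpark (e.g. `groupBy` on a derived quarter column) once the SAS-to-PySpark conversion takes place.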
Mandatory skills needed in Beeline:
- Linux
- Hadoop Cloudera
- IBM Spectrum Conductor
- IBM Spectrum Scale
- Python
Top Must-Have Skills:
- Python with Pandas (required)
- IBM Spectrum Conductor & Spectrum Scale
- Linux (expert level)
- Hadoop and Cloud Platforms
- Large data center environments
Candidate Requirements:
- PySpark
- Oracle/SQL
- Technical skills
- Strong communication
- Analytical