Required Skills and Experience:
Minimum 5 years of progressive Data Architecture experience
Minimum 3 years of experience working with Big Data technologies
Experience with and strong knowledge of the Big Data ecosystem, including Hadoop HDFS, Apache Spark, Sqoop, Flume, and MapReduce
Must have worked on building a robust big data architecture
Strong SQL knowledge and ability to write and understand complex SQL
Proven experience ingesting data from multiple data sources such as REST APIs, SFTP flat files, and streaming data
Proven experience in creating data pipelines using tools such as Apache Airflow
Proven experience with Big Data querying tools such as Athena/Presto, Pig, Hive, and Impala
Must have a strong understanding of data lineage, data provenance, and high-fidelity data concepts
Proven experience with Big Data ML toolkits, such as Mahout, Spark MLlib, or H2O
Proven experience in creating and managing data catalogs, metadata management, and semantic taxonomy management
Troubleshooting and performance tuning experience with the Big Data technology stack is a plus
Applied knowledge of data modeling principles
Strong understanding of Big Data technologies running in the cloud
Strong understanding of RDBMS systems and SQL querying techniques
Strong experience using Python scripts and libraries
Proven experience in many of the following:
Enterprise Data Warehouse design experience
Experience with at least one data modeling tool, such as Erwin or PowerDesigner
Experience with industry-standard ETL tools
Experience creating logical and physical data models based on the business reporting requirements
Experience with at least one RDBMS, such as Teradata, Oracle, or SQL Server
Experience with software engineering workflows using tools such as Jenkins, Git, and TFS
Experience with data governance processes and structure
Experience desired with Data Warehousing design concepts: Dimensional Modeling, Star/Snowflake Schemas, ETL/ELT, Data Marts, Analytic Playgrounds
Proven experience with team collaboration, release management, and system and performance monitoring