What you will do:
• Create and maintain optimal data pipeline architecture.
• Assemble large, complex data sets that meet functional and non-functional business requirements.
• Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
• Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using ‘big data’ technologies.
• Build analytics tools that utilize the data pipeline to provide operational efficiency and other key business performance metrics.
What we are looking for:
• Strong experience with Hadoop and Spark.
• Strong experience in:
  • Writing HiveQL (HQL) and SQL queries
  • Performance optimization techniques in Spark and Hive
  • Processing large volumes of data
• Ability to understand business requirements and implement analytical solutions independently or with minimal supervision.
• Experience with data modeling and data warehousing concepts.