We are seeking a skilled Data Engineer to join our dynamic team. In this role, you will be responsible for designing, implementing, and maintaining scalable data pipelines and infrastructure on the AWS cloud platform. The ideal candidate will have extensive experience with AWS services, particularly in big data processing and analytics.
The role involves working closely with cross-functional teams to support data-driven decision-making, with a focus on delivering business objectives while improving efficiency and ensuring high service quality.
KEY RESPONSIBILITIES:
Design, develop, and maintain large-scale data pipelines that can handle large datasets from multiple sources.
Apply expertise in real-time data replication and batch processing using distributed computing platforms such as Spark and Kafka.
Optimize performance of data processing jobs and ensure system scalability and reliability.
Collaborate with DevOps teams to manage infrastructure, including cloud environments such as AWS.
Collaborate with data scientists, analysts, and business stakeholders to develop tools and platforms that enable advanced analytics and reporting.
Lead and mentor junior data engineers, providing guidance on best practices, code reviews, and technical solutions.
Evaluate and implement new frameworks and tools for data engineering.
Maintain a healthy working relationship with business partners, users, and other MLI departments.
Take responsibility for the overall performance, cost, and delivery of technology solutions.
KEY TECHNICAL COMPETENCIES/SKILLS REQUIRED:
Hands-on experience with AWS services such as S3, DMS, Lambda, EMR, Glue, Redshift, RDS (Postgres), Athena, Kinesis, etc.
Expertise in data modelling and knowledge of modern file and table formats.
Expertise in data replication tools such as Qlik Replicate and AWS DMS.
Proficiency in programming languages such as Python, PySpark, SQL, and PL/SQL for implementing data pipelines and ETL processes.
Experience architecting or deploying cloud/virtualization data solutions (such as data lakes, EDWs, and data marts) in an enterprise environment.
Knowledge of the modern data stack and of keeping the technology stack up to date.
Knowledge of DevOps to perform CI/CD for data pipelines.
Knowledge of data observability, automated data lineage, and metadata management would be an added advantage.
Experience with cloud or hybrid-cloud (preferably AWS) solutions supporting the data strategy for data lake, BI, and analytics.
Experience setting up logging, monitoring, alerting, and dashboards for cloud and data solutions.
Experience with data warehousing concepts.
Desired qualifications and experience:
Bachelor’s degree in Computer Science, Engineering, or related field (Master’s preferred).
7 to 12 years of proven experience as a Data Engineer or in a similar role, with a strong focus on AWS cloud.
Strong analytical and problem-solving skills with attention to detail.
Excellent communication and collaboration skills.
AWS certifications (e.g., AWS Certified Big Data - Specialty) are a plus.