📍 Work Location: Remote
💼 Experience:
• Data Engineer – 4 to 6 Years
• Senior Data Engineer – 6 to 8 Years
🗣 Requirement: Strong communication skills (client onboarding exposure preferred)
🛠️ Key Responsibilities
⚙️ Data Engineering & Pipeline Development
– Design and develop scalable data pipelines and ETL workflows for large datasets.
– Build efficient data models, schemas, and transformation logic for analytics and reporting.
– Manage and optimize data lakes, warehouses, and lakehouse architectures.
☁️ Big Data & Cloud Platforms
– Develop and maintain data processing systems using Spark, Delta Lake, Kafka, and related technologies.
– Build cloud-based data solutions using Azure, AWS, or GCP.
– Ensure solutions are scalable, secure, and cost-efficient.
📊 Data Quality & Governance
– Implement data validation, monitoring, and quality frameworks.
– Maintain compliance with security, access control, and data privacy standards.
– Support adoption of modern data exchange standards such as FHIR where applicable.
🔄 Real-Time Data Processing
– Develop streaming data pipelines using Apache Kafka or similar event streaming platforms.
– Ensure real-time data availability for analytics and downstream applications.
🤝 Collaboration
– Work closely with data scientists, analysts, and engineering teams to deliver reliable data solutions.
– Contribute to continuous improvement of data architecture and engineering best practices.
🎓 Required Skills & Qualifications
– Strong experience with SQL, Python, and distributed data processing frameworks
– Hands-on experience with Spark, Delta Lake, Kafka, Presto, and other big data tools
– Experience with cloud platforms such as Azure, AWS, or GCP
– Experience with Docker, Kubernetes, and scalable distributed systems
– Experience building data pipelines, ETL processes, and scalable data models
– Familiarity with data governance, data cataloging, and master data management
⭐ Preferred Experience
– Experience with Databricks or Microsoft Fabric
– Exposure to healthcare datasets or regulated data environments
– Experience with real-time streaming architectures
– Familiarity with modern data warehousing tools such as Snowflake, Redshift, BigQuery, or Synapse
– Exposure to Nextflow or similar workflow orchestration platforms
📩 To Apply: Share your CV at rashmita.r@mitrhr.com