Senior Data Engineer (EM-12291)
Responsibilities:
Develop and maintain data pipelines and ETL (Extract, Transform, Load) processes (a minimal sketch follows this list)
Work with structured and unstructured data to ensure it is accessible and usable
Optimize data systems for performance and scalability
Implement data quality and data governance standards
Collaborate with stakeholders across technology and business units to understand their data needs, translate them into technical solutions, and provide data-driven insights
Create and maintain technical documentation and training materials to support knowledge sharing within the team
Participate in code reviews and contribute to the improvement of development processes
Contribute to the broader data architecture community through knowledge sharing and presentations
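To give a concrete flavor of the pipeline work described above, here is a minimal ETL sketch in Python using pandas; the file paths and column names are placeholders for illustration, not details of this role's actual stack.

```python
import pandas as pd

# Extract: read a raw export; the path and schema here are hypothetical
raw = pd.read_csv("data/raw_orders.csv")

# Transform: normalize headers, parse dates, and drop rows missing key fields
raw.columns = [c.strip().lower() for c in raw.columns]
raw["order_date"] = pd.to_datetime(raw["order_date"], errors="coerce")
clean = raw.dropna(subset=["order_id", "order_date"])

# Load: write a curated Parquet file for downstream consumers
clean.to_parquet("data/curated/orders.parquet", index=False)
```

In practice a step like this would run inside an orchestrated pipeline with error handling, monitoring, and data quality checks around it.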
Requirements:
8+ years of experience in data engineering or a related field
Proficiency in Python
Experience with data processing frameworks such as Apache Spark or Hadoop (see the Spark sketch after this list)
Knowledge of database systems (SQL and NoSQL)
Experience working with Snowflake and Databricks
Familiarity with cloud platforms (AWS, Azure) and their data services
Understanding of data modeling and data architecture principles
Experience with data warehousing concepts and technologies
Experience with message queues and streaming platforms (e.g., Kafka)
Experience with version control systems (e.g., Git)
Experience using Jupyter notebooks for data exploration, analysis, and visualization
Excellent communication and collaboration skills
Ability to work independently and as part of a geographically distributed team
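As a small illustration of the Spark experience expected above, the sketch below aggregates curated data with PySpark; the session setup, paths, and column names are assumptions for illustration only.

```python
from pyspark.sql import SparkSession, functions as F

# Local session for illustration; real cluster configuration is environment-specific
spark = SparkSession.builder.appName("orders-summary").getOrCreate()

# Read curated data produced by an upstream pipeline step (path is hypothetical)
orders = spark.read.parquet("data/curated/orders.parquet")

# Aggregate daily revenue per customer; the column names are assumptions
daily_revenue = (
    orders.groupBy("customer_id", F.to_date("order_date").alias("day"))
          .agg(F.sum("amount").alias("revenue"))
)

daily_revenue.write.mode("overwrite").parquet("data/marts/daily_revenue")
```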
Advantages:
Familiarity with data visualization tools (e.g., Tableau, Power BI)
Knowledge of data governance and security best practices such as data access control and data masking (see the masking sketch after this list)
Experience with Agile methodologies
Familiarity with data catalog and metadata management tools (e.g., Collibra)
Familiarity with CI/CD pipelines and DevOps practices
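For the data masking practice mentioned above, a minimal sketch of one possible approach is shown below; the salt handling and masking format are placeholders, not a prescribed standard.

```python
import hashlib

def mask_email(email: str, salt: str) -> str:
    """Replace the local part of an email with a salted hash and truncate the domain.

    The salt and masking policy here are placeholders; a real policy would
    follow the organization's governance standards.
    """
    local, _, domain = email.partition("@")
    digest = hashlib.sha256((salt + local).encode("utf-8")).hexdigest()[:12]
    return f"{digest}@{domain[:1]}***"

# A masked value is safe to share with analysts who do not need raw PII
print(mask_email("jane.doe@example.com", salt="rotate-me"))
```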
Responsibilities:
Design and develop scalable, reliable ETL processes using Python and Databricks
Build and maintain data pipelines for extracting, transforming, and loading data from various sources
Take ownership of the full data engineering lifecycle, from extraction to transformation and loading
Optimize data workflows, ensuring robust error handling, monitoring, and performance tuning
Work within an agile environment, actively participating in sprint planning, stand-ups, and retrospectives
Conduct code reviews and maintain high coding standards
Develop tooling and automation scripts to improve operational efficiency
Implement comprehensive testing for data pipelines, including unit and integration tests (see the test sketch after this list)
Integrate data sources via REST APIs and other techniques
Maintain up-to-date technical documentation and data flow diagrams
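To illustrate the pipeline testing expectation above, here is a small pytest-style unit test for a hypothetical deduplication transform; the function, data, and expectations are illustrative only.

```python
import pandas as pd

def dedupe_latest(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only the most recent row per order_id (illustrative transform)."""
    return (
        df.sort_values("updated_at")
          .drop_duplicates(subset="order_id", keep="last")
          .reset_index(drop=True)
    )

def test_dedupe_latest_keeps_most_recent_row():
    df = pd.DataFrame(
        {
            "order_id": [1, 1, 2],
            "updated_at": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-01"]),
            "status": ["created", "shipped", "created"],
        }
    )
    result = dedupe_latest(df)
    # One row per order, and the surviving row for order 1 is the later update
    assert len(result) == 2
    assert result.loc[result["order_id"] == 1, "status"].item() == "shipped"
```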
Requirements:
Proficient in Python, with a focus on clean, efficient, and maintainable code
Hands-on experience with Databricks and cloud-based data engineering tools
Skilled in Snowflake or other cloud data warehousing platforms (see the Snowflake sketch after this list)
Solid understanding of ETL principles, data modeling, and integration best practices
Comfortable working in agile, fast-paced, collaborative environments
Experienced with Git and version control systems
Detail-oriented with a strong problem-solving mindset
Familiar with Linux systems and REST API integrations
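As an example of the Snowflake skills listed above, the sketch below queries a warehouse table with the snowflake-connector-python package; the account settings, warehouse, table, and columns are placeholders, not this team's actual objects.

```python
import os

import snowflake.connector

# Connection details come from environment variables; the object names are hypothetical
conn = snowflake.connector.connect(
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    warehouse="ANALYTICS_WH",
    database="ANALYTICS",
    schema="PUBLIC",
)
try:
    cur = conn.cursor()
    # Parameter binding keeps the query safe from injection
    cur.execute(
        "SELECT day, revenue FROM daily_revenue WHERE day >= %s ORDER BY day",
        ("2024-01-01",),
    )
    for day, revenue in cur.fetchall():
        print(day, revenue)
finally:
    conn.close()
```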
Advantages:
Experience with data visualization tools (such as Power BI)
Knowledge of data orchestration tools like Apache Airflow (see the DAG sketch after this list)
Exposure to big data frameworks including Hadoop or Spark
Background in database administration or performance optimization
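To illustrate the Airflow orchestration mentioned above, here is a minimal DAG sketch, assuming Airflow 2.x; the DAG id, schedule, and task callables are purely illustrative.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    # Placeholder for pulling data from a source system
    print("extracting")

def load():
    # Placeholder for loading curated data into the warehouse
    print("loading")

with DAG(
    dag_id="daily_orders_pipeline",  # name is illustrative
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task
```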