
    Data Engineer III

     

About the job

The world’s most critical, and most at-risk, business applications have been neglected for far too long. Onapsis eliminates this blind spot by providing cybersecurity solutions dedicated to business-critical applications. Whether running on premises, in the cloud, or in a hybrid environment, Onapsis helps nearly 30% of the Forbes Global 100 understand the threats and risks across their SAP and Oracle landscapes.

We are seeking a Data Engineer III to join our mission-driven team. This role is ideal for experienced data engineers with a proven track record of architecting scalable data pipelines, leveraging cloud technologies, and contributing to high-impact cybersecurity solutions. You will be responsible for building high-performance ETL frameworks, optimizing data platforms, and directly enhancing our customers’ threat detection, response, and remediation capabilities.

What you will be doing, your legacy:

You will work directly with company Principal Engineers to evaluate, scope, propose, and build features that fulfill business requirements and protect our customers. You will play a direct role in laying the technical foundation for a new product offering. You will also work with Engineering and DevOps to deliver high-quality products and services, and collaborate closely with security and IT professionals to ensure safe and secure best practices are followed.

Responsibilities:

- Architect and
Design Scalable Data Solutions: Design, develop, and maintain data lakehouse solutions (Iceberg / Delta Lake / Hudi), applying industry best practices and structuring and optimizing the data according to data access patterns.
- Data Pipeline Development: Implement ETL/ELT pipelines using cloud technologies (Spark / PySpark / Glue, Kinesis Streams / Iceberg) to load data into a lakehouse for both efficient ML processing and UI reporting. Implement data models and data processing frameworks (Spark, Kafka, Snowflake) to ingest, transform, and load large datasets into data lakehouse technologies (Apache Iceberg, Delta Lake, or Apache Hudi), ensuring high availability and reliability of data.
- Advanced Data Integration: Develop solutions that integrate multiple data sources into Snowflake or similar data warehouses to enable real-time analytics and reporting across dashboards.
- AI/ML Integration: Collaborate with cross-functional teams to co-develop AI-driven features that identify patterns and anomalies in client data using AI/ML technologies (Python).
- Compliance and Security: Ensure compliance with industry standards and secure best practices (SOX, SOC 1/2) by implementing data governance frameworks, monitoring data pipelines, and optimizing cloud database architectures to protect sensitive information.
- Stakeholder Collaboration: Work closely with stakeholders, including analysts, engineers, and product managers, to understand their data needs, propose solutions, and drive data-driven decision-making by delivering actionable insights.
- Data Infrastructure Monitoring: Continuously monitor, troubleshoot, and enhance data pipelines, leveraging CI/CD tools (Docker, Jenkins, GitHub Actions) and orchestrating workflows with Apache Airflow to maintain operational efficiency.
- Leadership and Mentorship: Provide hands-on mentorship and technical guidance to junior engineers, including code reviews and architecture discussions.
- Documentation and Governance: Establish comprehensive documentation for data architecture, governance, and processes to ensure scalability, compliance, and security.

Qualifications:

- 3+ years of proven
experience as a Data Engineer or in a similar role, with a deep understanding of data architecture and cloud-based ETL/ELT frameworks.
- Strong experience with AWS (preferred) or Azure, particularly with Glue, EMR, S3, and Lambda. Databricks, Snowflake, or Synapse experience is a bonus.
- Proficiency in big data technologies such as Apache Spark, Kafka, Hadoop, and Databricks for distributed data processing.
- Proficiency with Python libraries for data processing and ML (e.g., Pandas, NumPy, Polars, Scikit-learn, PyTorch, TensorFlow).
- Hands-on experience building real-time data processing and AI/ML-driven analytics solutions (SageMaker, Bedrock, NLP, Power BI).
- Ability to architect and manage data lakehouse solutions (Iceberg / Delta Lake / Hudi) or classic warehouse solutions (Redshift, Snowflake).
- Familiarity with compliance and audit requirements (SOX, SOC 1/2, GDPR) and with implementing data governance and security frameworks.
- Strong problem-solving skills with a focus on data integrity, scalability, and performance optimization.
- Experience with CI/CD tools (Jenkins, GitHub Actions, Docker) and data orchestration platforms (Apache Airflow).

Preferred Qualifications:

- Experience with advanced data architecture principles (medallion architecture, materialized views, task scheduling).
- Experience using BI tools (e.g., Power BI, Tableau) for real-time analytics and operational reporting.

What we offer:

- A role in shaping the future of protecting the most critical applications that run the world’s business, and a career that grows as the company grows.
- A unique culture of high achievement and teamwork, with supportive and humble colleagues who are among the space’s top problem solvers and innovators.
- Financial security through competitive compensation and incentives.

Location: Onapsis is establishing a new development center in Bucharest. This is a hybrid role (1-2 days per week from the office), so candidates must be able to commute to Bucharest every week.

About Onapsis:

Onapsis protects the business applications that run
the global economy. The Onapsis Platform delivers vulnerability management, change assurance, and continuous compliance for business applications from leading vendors such as SAP, Oracle, and others. The Onapsis Platform is powered by the Onapsis Research Labs, the team responsible for the discovery and mitigation of more than 1,000 zero-day vulnerabilities in business applications. Onapsis is headquartered in Boston, MA, with offices in Heidelberg, Germany, and Buenos Aires, Argentina, and proudly serves hundreds of the world’s leading brands, including close to 30% of the Forbes Global 100, six of the top 10 automotive companies, five of the top 10 chemical companies, four of the top 10 technology companies, and three of the top 10 oil and gas companies. For more information, connect with Onapsis on LinkedIn or visit https://www.onapsis.com.

#LI-AC1 #LI-Hybrid