About the job

The world’s most critical, and most at-risk, business applications have
been neglected for far too long. Onapsis eliminates this blind spot by providing
cybersecurity solutions dedicated to business-critical applications. Whether
running on premises, in the cloud, or in a hybrid environment, Onapsis helps
nearly 30% of the Forbes Global 100 understand the threats and risks across
their SAP and Oracle landscapes.

We are seeking a Data Engineer III to join our
mission-driven team. This role is ideal for experienced data engineers with a
proven track record in architecting scalable data pipelines, leveraging cloud
technologies, and contributing to high-impact cybersecurity solutions. You will
be responsible for building high-performance ETL frameworks, optimizing data
platforms, and contributing directly to the enhancement of our customers’ threat
detection, response, and remediation capabilities.

What you will be doing, your legacy:

You will be working directly with the company’s Principal Engineers,
evaluating, scoping, proposing, and building features to fulfill business
solution requirements to protect our customers. You will play a direct role in
laying the technical foundation for a new product offering. Additionally, you
will be working with Engineering and DevOps to deliver high-quality products and
services while also working closely with security and IT professionals to ensure
safe and secure best practices are followed.

Responsibilities:

Architect and Design Scalable Data Solutions: Design, develop, and maintain data
lakehouse solutions (Iceberg / Delta Lake / Hudi), applying industry best
practices and structuring and optimizing the data according to data access
patterns.

Data Pipeline Development: Implement ETL/ELT pipelines using cloud technologies
(Spark / PySpark / Glue, Kinesis Streams / Iceberg) to load data into a
lakehouse for both efficient ML processing and UI reporting. Implement data
models and data processing frameworks (Spark, Kafka, Snowflake) to ingest,
transform, and load large datasets into data lakehouse technologies (Apache
Iceberg, Delta Lake, or Apache Hudi), ensuring high availability and
reliability of data.

Advanced Data Integration: Develop solutions that integrate multiple data
sources into Snowflake or similar data warehouses to enable real-time analytics
and reporting across dashboards.

AI/ML Integration: Collaborate with cross-functional teams to co-develop
AI-driven features that identify patterns and anomalies in client data using
AI/ML technologies (Python).

Compliance and Security: Ensure compliance with industry standards and secure
best practices (SOX, SOC 1/2) by implementing data governance frameworks,
monitoring data pipelines, and optimizing cloud database architectures to
protect sensitive information.

Stakeholder Collaboration: Work closely with stakeholders, including analysts,
engineers, and product managers, to understand their data needs, propose
solutions, and drive data-driven decision-making by delivering actionable
insights.

Data Infrastructure Monitoring: Continuously monitor, troubleshoot, and enhance
data pipelines, leveraging CI/CD tools (Docker, Jenkins, GitHub Actions) and
orchestrating workflows using Apache Airflow to maintain operational
efficiency.

Leadership and Mentorship: Provide hands-on mentorship and technical guidance
to junior engineers, including code reviews and architecture discussions.

Documentation and Governance: Establish comprehensive documentation for data
architecture, governance, and processes to ensure scalability, compliance, and
security.

Qualifications:

3+ years of proven
experience as a Data Engineer or in a similar role, with a deep understanding of
data architecture and cloud-based ETL/ELT frameworks.

Strong experience with AWS (preferred) or Azure, particularly with Glue, EMR,
S3, and Lambda. Databricks, Snowflake, or Synapse experience is a bonus.

Proficiency in big data technologies such as Apache Spark, Kafka, Hadoop, and
Databricks for distributed data processing.

Proficiency with Python libraries for data processing and ML (e.g., Pandas,
NumPy, Polars, Scikit-learn, PyTorch, TensorFlow).

Hands-on experience in building real-time data processing and AI/ML-driven
analytics solutions (SageMaker, Bedrock, NLP, Power BI).

Ability to architect and manage data lakehouse solutions (Iceberg / Delta Lake /
Hudi) or classic warehouse solutions (Redshift, Snowflake).

Familiarity with compliance and audit requirements (SOX, SOC 1/2, GDPR) and
with implementing data governance and security frameworks.

Strong problem-solving skills with a focus on data integrity, scalability, and
performance optimization.

Experience with CI/CD tools (Jenkins, GitHub Actions, Docker) and data
orchestration platforms (Apache Airflow).

Preferred Qualifications:

Experience with advanced data architecture principles (medallion architecture,
materialized views, task scheduling).

Experience using BI tools
(e.g., Power BI, Tableau) for real-time analytics and operational reporting.

What we offer:

A role in shaping the future of protecting the most critical applications that
run the world’s business, and a career that grows as the company grows.

A unique culture of high achievement and teamwork.

Supportive and humble colleagues who are the space’s top problem solvers and
innovators.

Financial security through competitive compensation and incentives.

Location:

Onapsis is establishing a new development center in Bucharest. This is a hybrid
role (1-2 days per week in the office), so candidates must be able to commute
to Bucharest
every week.

About Onapsis:

Onapsis protects the business applications that run the global economy. The
Onapsis Platform delivers vulnerability management, change assurance, and
continuous compliance for business applications from leading vendors such as
SAP, Oracle, and others. The Onapsis Platform is powered by the Onapsis Research
Labs, the team responsible for the discovery and mitigation of more than 1,000
zero-day vulnerabilities in business applications.

Onapsis is headquartered in Boston, MA, with offices in Heidelberg, Germany and
Buenos Aires, Argentina, and proudly serves hundreds of the world’s leading
brands, including close to 30% of the Forbes Global 100, six of the top 10
automotive companies, five of the top 10 chemical companies, four of the top 10
technology companies, and three of the top 10 oil and gas companies.

For more information, connect with Onapsis on LinkedIn or visit
https://www.onapsis.com.