Digital Twin Project
With innovation at the core of our business model and corporate values, Kruger Products and Kruger Inc. have been at the forefront of the transformation of the pulp and paper industry in Canada, implementing ground-breaking AI capabilities and driving the industry's digital transformation. To support this journey, we are building an AI-focused team with positions such as Data Scientist, Software Engineer, and Data Engineer. You'll play an active role in the Digital Twin Project, which involves developing a virtual model of the Plant's entire supply chain and using real-time data augmented with AI capabilities to improve the Plant's overall performance.
As a Data Engineer, AI, you will play a pivotal role in building and operationalizing our analytics and AI initiatives. You will work with key business stakeholders, IT experts and subject-matter experts to plan and deliver optimal analytics and data science solutions.
You will also collaborate with data scientists, data analysts and other data consumers, optimizing models and algorithms for data quality, security and governance, and putting them into production to achieve potentially large productivity gains.
• Build data pipelines: Architect, create, maintain and optimize data pipelines as workloads move from development to production for specific use cases. This will be your primary responsibility.
• Drive automation: Use innovative and modern tools, techniques and architectures to partially or completely automate the most common, repeatable and tedious data preparation and integration tasks, minimizing manual, error-prone processes and improving productivity.
• Educate and train: Stay curious and knowledgeable about new data initiatives and how to address them, and propose appropriate (and innovative) data ingestion, preparation, integration and operationalization techniques to address data requirements optimally. You will also train counterparts (data scientists, data analysts, business users and other data consumers) in these data pipelining and preparation techniques, making it easier for them to integrate and consume the data they need for their own use cases.
• Ensure compliance and governance during data use: Through data governance and compliance initiatives, ensure that data consumers use the data provisioned to them responsibly.
• Promote reusability, scalability and performance while building our modern data platform.
• A bachelor's or master's degree in computer science, statistics, applied mathematics, data management, information systems, information science or a related quantitative field [or equivalent work experience] is required.
• An advanced degree in computer science, statistics, applied mathematics, data management, information systems, information science or a related quantitative field (MS, Ph.D., MIS or postgraduate diploma) [or equivalent work experience] is preferred.
• At least 3 years of relevant work experience in data management disciplines, including data integration, modeling, optimization and data quality, and/or other areas directly relevant to data engineering responsibilities and tasks.
• At least 2 years of experience working in cross-functional teams and collaborating with business stakeholders in support of a departmental and/or multi-departmental data management and analytics initiative.
• Experience in master data management, custom data processing algorithm development, big data analytic process development and distributed data processing, with a clear understanding of different data domains.
• Experience in developing applications in high volume data staging / ETL environments
• Experience in agile and CI/CD-based software development and delivery
• Experience in developing and maintaining formal documentation describing data and data structures including data modelling
• Extensive experience in data programming use cases with Python.
• Experience with the following languages or software is considered a plus:
o Other programming languages, e.g., R, C/C++, Scala, Java
o Kafka, NoSQL, Spark and HDFS frameworks for analytically driven business use cases
o Advanced Linux and SQL skills
o Cloudera big data platform
o AnyLogic or another data process simulation framework
SKILLS AND ABILITIES
• Data visualization using Python libraries or other frameworks
• Team-oriented and flexible, with a proven track record of collaborating with multiple stakeholders
• Ability to quickly learn new technologies
• Strong attention to detail and ability to think critically and conceptually
• Strong interpersonal skills
• Strong ability to multi-task in a dynamic, fast-paced environment
• Good verbal and written communication skills, both in English and French