Data Engineer, Analytics
Location | Kampala, Uganda
Date Posted | June 12, 2025
Category | Banking / Engineering / IT / Information Technology
Job Type | Full-time
Currency | UGX
Description

JOB DETAILS:
Job Purpose Statement
The Data Engineering team is responsible for documenting data models, architecting distributed systems, creating reliable data pipelines, combining data sources, architecting data stores, and collaborating with the data science teams to build the right solutions for them.
They do this using open-source big data platforms such as Apache NiFi, Kafka, Hadoop, Apache Spark, Apache Hive, HBase and Druid, together with the Java programming language, picking the right tool for each purpose.
The growth of every product relies heavily on data, for example for scoring and for studying product behavior that may inform improvements, and it is the role of the data engineer to build fast, horizontally scalable architectures using modern tools rather than the traditional Business Intelligence systems as we know them.
Key Accountabilities (Duties and Responsibilities)
- Documenting Data Models (10%): Responsible for documenting the entire end-to-end journey that data elements take, from the data sources to all the data stores, including all the transformations in between, and keeping those documents up to date with every change.
- Architecting Distributed Systems (10%): Modern data engineering platforms are distributed systems. The data engineer designs the right architecture for each solution using best-of-breed open-source tools in the big data ecosystem, because no single tool does everything; the tools are specialized, lean and fit for purpose. The architecture should be able to process any data, any time, anywhere, under any workload.
- Combining Data Sources (10%): Pulling data from different sources, which may hold structured, semi-structured or unstructured data, using tools such as Apache NiFi, and taking the data through a journey that produces a final state useful to the data consumers. These sources can include REST, JDBC, Twitter, JMS, images, PDF and MS Word documents, with the data landed in a staging environment such as Kafka topics for onward processing.
- Developing Data Pipelines (40%): Creating data pipelines that transform data using tools such as Apache Spark and the Java programming language. The pipelines may apply processing such as machine learning, aggregation and iterative computation.
- Architecting Data Stores (15%): Designing and creating data stores using big data platforms such as Hadoop and NoSQL databases such as HBase.
- Data Query and Analysis (10%): Utilizing tools such as Apache Hive to analyze data in the data stores to generate business insights.
- Team Leadership (5%): Providing team leadership to the data engineers.
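The pipeline stage described above (take staged events, transform and aggregate them, write the result onward) can be sketched in plain Java. This is a toy illustration using standard-library streams rather than the actual Spark/Kafka stack, and the record schema and names are hypothetical:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical ingested event: a transaction tagged with a product.
record Txn(String product, double amount) {}

public class PipelineSketch {
    // The "transform and aggregate" step: group staged events by
    // product and sum their amounts, producing a per-product total.
    static Map<String, Double> totalByProduct(List<Txn> events) {
        return events.stream()
                .collect(Collectors.groupingBy(
                        Txn::product,
                        Collectors.summingDouble(Txn::amount)));
    }

    public static void main(String[] args) {
        List<Txn> staged = List.of(
                new Txn("loans", 100.0),
                new Txn("loans", 50.0),
                new Txn("savings", 25.0));
        // Aggregated output would then be written to a data store.
        System.out.println(totalByProduct(staged));
    }
}
```

In the real platform the grouping and summing would run distributed across a Spark cluster; the shape of the computation is the same.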
Job Specifications
• A Bachelor’s degree in Computer Science
• Minimum 5 years’ experience developing object-oriented applications using the Java programming language
• Certification in and experience implementing best-practice frameworks (e.g. ITIL, PRINCE2) preferred
• Minimum 5 years' experience working with relational databases
• Minimum 5 years' experience working with the Linux operating system
• Experience with open-source big data platforms and tools (Hadoop, Kafka, Apache NiFi, Apache Spark, Apache Hive, NoSQL databases) and ODI
• Experience working with Data Warehouses
• Experience with DevOps, Agile ways of working and CI/CD
• Familiarity with complex systems integrations using SOA tools (Oracle Weblogic/ESB/SOA)
• Familiarity with industry standard formats and protocols (JMS, SOAP, XML/XPath/XQuery, REST and JSON) and data sources
• Excellent analytical, problem solving and reporting skills
• A good knowledge of the systems and processes within the financial services industry
Job Dimensions
Combining Data Sources
• Ingesting data from various data stores
• Cleaning the data
• Combining the data
• Staging the data into topics
Creating Data Pipelines
• Picking ingested data
• Applying transformations and aggregations
• Computing on the data with machine learning and predictive models
• Storing data in data stores
Architecting Data Stores
• Designing and deploying fast, responsive data stores
• Performance tuning
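The ingest → clean → combine → stage steps listed under Combining Data Sources can also be illustrated with a minimal Java sketch. In the real platform these would be NiFi processors writing to Kafka topics; here an in-memory queue stands in for the topic, and all names and payloads are hypothetical:

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;
import java.util.stream.Stream;

public class IngestSketch {
    // Stand-in for a Kafka topic: an in-memory queue of staged records.
    static Queue<String> stage(List<String> sourceA, List<String> sourceB) {
        Queue<String> topic = new ArrayDeque<>();
        Stream.concat(sourceA.stream(), sourceB.stream()) // combine the sources
                .map(String::trim)                        // clean: trim whitespace
                .filter(r -> !r.isEmpty())                // clean: drop blank records
                .forEach(topic::add);                     // stage into the "topic"
        return topic;
    }

    public static void main(String[] args) {
        // Two hypothetical feeds, e.g. a REST payload and a JDBC row.
        Queue<String> topic = stage(
                List.of(" rest-payload-1 ", ""),
                List.of("jdbc-row-1"));
        System.out.println(topic);
    }
}
```

Downstream pipelines would then consume from the staged topic for transformation and storage.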