Big Data Engineer – Health Systems at Discovery Limited


Job Description

Discovery Limited invites suitably qualified and experienced candidates for the position of Big Data Engineer – Health Systems to join our dynamic team.

Duties & Responsibilities

The successful applicant will be responsible for, but not limited to, the following job functions:

What you’ll do

  • Be able to perform installations from scratch
  • Handle all key server integration and HA responsibilities
  • Set up the Kerberos dependencies and necessary configuration
  • Create and manage users / groups
  • Manage the security roles – AD as well as Sentry / Ranger and OS
  • Understand all admin modules, ingestion pipeline as well as monitoring requirements
  • Be able to perform troubleshooting, e.g., identifying unhealthy / problematic modules when resolving an issue
  • Handle partitioning and indexing requirements with the objective not to compromise performance
  • Have excellent in-depth OS knowledge and handle any OS level scripting requirements
  • Provide the necessary OS training / knowledge transfer to the Junior Administrators
  • Manage the TCP proxy and any firewall port requirements
  • Plan and provision new hardware as required
  • Execute the Big Data hybrid strategy, e.g., seamless integration of on-premises and cloud clusters

Required Knowledge and Experience

We pride ourselves on having the best people, who are our most important asset. Our company has been recognized for having the highest ethics and strives for excellence through distinctly higher standards than the norm.

We therefore urge only candidates with these unique requirements and experience to apply for this stimulating position.

Education and Experience

  • BSc Computer Science or equivalent 3-year IT qualification
  • 3 years’ experience in a Chief Technology Office working on the Big Data environment
  • Cloudera / Hortonworks / Linux (Red Hat) certification – advantageous
  • Understanding of (with some experience in) highly distributed architecture
  • Working experience with Python including coding, Python with Spark, and ML frameworks
  • Good understanding of the JVM and multithreading
  • Good understanding of partitioning and sharding
  • Good tooling experience with Chef, Ansible and cloud skills (e.g., AWS), MLOps / MLflow, NVIDIA CUDA, InfiniBand RDMA, splitting traffic between the IP and InfiniBand stacks, and Cloudera CDP
  • Understanding of the infrastructure stack, including but not limited to:
  1. Datacentre architecture and processes. Good understanding and experience.
  2. Racks, servers, wiring etc. Excellent understanding and some experience.
  3. Network – topology, VLANs, switches, routers etc. Good understanding.
  4. Firewalls – reasonable understanding.
  5. AD forest and LDAP
  • Operating systems (specifically Linux) – expert understanding and experience
  • Cloud infrastructure (preferably AWS) architecture, provisioning and operating
  • Excellent level of knowledge and understanding about:
  1. HBase, HDFS, Spark, Dask, MapReduce, Hive, ZooKeeper, YARN, Kafka, NiFi, Atlas, Ranger, Impala, Airflow, Arrow, Livy, Jupyter notebooks etc.
  2. Kerberos and PKI.
  3. TCP, UDP, multicast
  4. Kubernetes and Docker
  5. InfiniBand network topology and protocol.

Personal Attributes and Skills

  • Technology savvy person with real hunger for acquiring technology knowledge
  • Ability and willingness to do research and do remote courses
  • Fast learner
  • Learning is regarded as a hobby
  • Really driven individual
  • Ability to work under stress
  • Self-starter
Interested applicants should Click here to apply

