Big Data Engineer – Health Systems at Discovery Limited
Discovery Limited invites suitably qualified and experienced candidates for the position of Big Data Engineer – Health Systems to join our dynamic team.
Duties & Responsibilities
The successful applicant will be responsible for, but not limited to, the following job functions:
What you’ll do
- Be able to perform installations from scratch
- Handle all key server integration and HA responsibilities
- Set up the Kerberos dependencies and necessary configuration
- Create and manage users / groups
- Manage the security roles – AD as well as Sentry / Ranger and OS
- Understand all admin modules, ingestion pipeline as well as monitoring requirements
- Be able to perform troubleshooting e.g., identifying unhealthy / problematic modules in the context of an issue / problem resolution
- Handle partitioning and indexing requirements with the objective not to compromise performance
- Have excellent in-depth OS knowledge and handle any OS level scripting requirements
- Provide the necessary OS training / knowledge transfer to the Junior Administrators
- Manage the TCP proxy and any firewall port requirements
- Plan and provision new hardware as required
- Execute the Big Data hybrid strategy, e.g., seamless integration of on-premises and cloud clusters
Required Knowledge and Experience
We pride ourselves on having the best people, who are our most important asset. Our company has been recognized for having the highest ethics and strives for excellence through distinctly higher standards than the norm.
We therefore urge only candidates with these unique requirements and experience to apply for this stimulating position.
Education and Experience
- BSc Computer Science or equivalent 3-year IT qualification
- 3 years’ experience in a Chief Technology Office working on the Big Data environment
- Cloudera / Hortonworks / Linux (Red Hat) certification – advantageous
- Understanding of (with some experience in) highly distributed architecture
- Working experience with Python including coding, Python with Spark, and ML frameworks
- Good understanding of the JVM and multithreading
- Good understanding of partitioning and sharding
- Good tooling experience with Chef, Ansible and cloud skills (e.g., AWS), MLOps / MLflow, NVIDIA CUDA, InfiniBand RDMA, splitting traffic between the IP and InfiniBand stacks, and Cloudera CDP
- Understanding of the infrastructure stack, including but not limited to:
  - Datacentre architecture and processes – good understanding and experience
  - Racks, servers, wiring etc. – excellent understanding and some experience
  - Network (topology, VLANs, switches, routers etc.) – good understanding
  - Firewalls – reasonable understanding
  - AD forest and LDAP
  - Operating systems (specifically Linux) – expert understanding and experience
  - Cloud infrastructure (preferably AWS) – architecture, provisioning and operation
- Excellent level of knowledge and understanding of:
  - HBase, HDFS, Spark, Dask, MapReduce, Hive, ZooKeeper, YARN, Kafka, NiFi, Atlas, Ranger, Impala, Airflow, Arrow, Livy, Jupyter notebooks etc.
  - Kerberos and PKI
  - TCP, UDP, multicast
  - Kubernetes and Docker
  - InfiniBand network topology and protocol
Personal Attributes and Skills
- Technology savvy person with real hunger for acquiring technology knowledge
- Ability and willingness to do research and do remote courses
- Fast learner
- Learning is regarded as a hobby
- Really driven individual
- Ability to work under stress