Data Engineer – Agentic AI & Large-Scale Data - (CONTRACT), Cape Town
Cape Town, South Africa
Posted: less than a month ago
Role Summary
Our client is looking for a strong Data Engineer with hands-on experience in data platforms, large-scale scraping, LLMs, and agentic AI solutions.
This person should be able to build reliable data pipelines and also help design AI agents that can use tools, query systems, process data, and automate complex workflows.
Key Responsibilities
- Build scalable data pipelines for batch and near-real-time ingestion.
- Ingest data from APIs, files, databases, event streams, and web sources.
- Design and operate large-scale scraping and data acquisition solutions.
- Clean, validate, transform, and model data for analytics, reporting, and AI use cases.
- Build LLM-powered workflows and agentic solutions using tool calling, structured outputs, RAG, and API/database integration.
- Support data lake, warehouse, and lakehouse architectures.
- Implement data quality checks, schema validation, monitoring, and observability.
- Work with engineering teams to build production-ready, secure, and maintainable solutions.
Required Experience
- Strong Python and/or Node.js/TypeScript experience.
- Strong SQL and relational database experience.
- Proven experience building production-grade data pipelines.
- Experience with AWS or similar cloud platforms.
- Experience with data lakes, warehouses, or lakehouse technologies.
- Practical experience with LLM APIs such as OpenAI, Anthropic, Gemini, Bedrock, or similar.
- Experience with agentic patterns such as tool use, task decomposition, retrieval, memory, and human-in-the-loop workflows.
- Experience with scraping tools such as Playwright, Puppeteer, Scrapy, or Selenium.
- Good understanding of APIs, CI/CD, containers, testing, and secure engineering practices.
Desirable Experience
- Snowflake, Trino, Athena, Iceberg, Delta Lake, Databricks, or BigQuery.
- Kafka, Kinesis, SQS, SNS, Airflow, Dagster, Prefect, or Temporal.
- Vector databases such as pgvector, Pinecone, Weaviate, Qdrant, or OpenSearch.
- LangChain, LangGraph, LlamaIndex, CrewAI, AutoGen, Semantic Kernel, or MCP.
- Experience with data governance, PII handling, encryption, auditability, and compliance.
Candidate Profile
- The ideal candidate is a practical builder who can turn ambiguous requirements into working systems.
- They should be comfortable across data engineering, scraping, automation, and applied AI, with a focus on reliable production solutions rather than demos.
Example Projects
- Build scalable scraping pipelines for public market and supplier data.
- Create AI agents that extract, validate, and structure data from approved sources.
- Build data pipelines into a lakehouse or warehouse.
- Develop RAG and LLM-powered assistants over business data.
- Automate manual research and data preparation workflows.
Success Measures
- Reliable data pipelines in production.
- High-quality structured data available for analytics and AI.
- Useful and controlled agentic workflows.
- Reduced manual data collection and preparation effort.
- Strong engineering standards across code, documentation, and operations.
Kindly regard your application as unsuccessful if you have not heard from the agency within 2 weeks.
Company name: Project Management Connection
Job position: Data Engineer – Agentic AI & Large-Scale Data - (CONTRACT)