Building robust, scalable and maintainable data pipelines is a cornerstone of any successful AI application. At The Agile Monkeys, we’re constantly exploring the best tools for the job. We recently undertook a project to migrate an existing Python-based data pipeline (one that powers AI Findr, our AI search solution) to Apache NiFi.
The goal was to evaluate whether NiFi would allow us to generate workflows in a fast, robust, and maintainable way, while remaining flexible enough for easy adoption of new clients. Specifically, the pipeline had to:
Ingest JSON files containing fashion product information from an e-commerce platform.
Transform this data by enriching it and creating two distinct embeddings from product details and images.
Load the processed output into Weaviate Cloud collections in the exact format required by our search engine.
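To make the three stages above concrete, here is a minimal Python sketch of their shape. It is not the actual AI Findr code: the field names (title, description, image_url), the one-JSON-object-per-line input, the ProductRecord type, and the embed_text, embed_image, and sink callables are all assumptions made for illustration. In a real pipeline the sink would wrap a Weaviate client and the embedders would call real models.

```python
import json
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator

# Hypothetical type: any function that maps a string to a vector.
Embedder = Callable[[str], list[float]]


@dataclass
class ProductRecord:
    """Hypothetical output shape: properties plus two embeddings."""
    properties: dict
    text_vector: list[float]
    image_vector: list[float]


def ingest(json_lines: Iterable[str]) -> Iterator[dict]:
    """Parse one JSON product per line (an assumption), skipping blank lines."""
    for line in json_lines:
        if line.strip():
            yield json.loads(line)


def transform(product: dict, embed_text: Embedder, embed_image: Embedder) -> ProductRecord:
    """Enrich a product and compute its two embeddings."""
    # Placeholder enrichment: real business rules would go here.
    enriched = {**product, "title": product["title"].strip()}
    text = f'{enriched["title"]} {enriched.get("description", "")}'
    return ProductRecord(
        properties=enriched,
        text_vector=embed_text(text),
        image_vector=embed_image(enriched["image_url"]),
    )


def load(records: Iterable[ProductRecord], sink: Callable[[ProductRecord], None]) -> int:
    """Send each record to the sink and return how many were written."""
    count = 0
    for record in records:
        sink(record)
        count += 1
    return count


def run_pipeline(json_lines: Iterable[str], embed_text: Embedder,
                 embed_image: Embedder, sink: Callable[[ProductRecord], None]) -> int:
    """Chain ingest -> transform -> load lazily, one product at a time."""
    records = (transform(p, embed_text, embed_image) for p in ingest(json_lines))
    return load(records, sink)
```

Passing the embedders and the sink in as plain callables keeps each stage testable with stubs, which is roughly the flexibility we were hoping NiFi's processors would give us.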
After spending considerable time with NiFi, we’ve gathered our thoughts on where it shines and where it falls short. Here’s our take.
The Good: Where NiFi shines
Apache NiFi is a powerful tool with some undeniable strengths, especially for creating and managing data flows.
Out-of-the-box flexibility: With hundreds of built-in processors, NiFi makes it remarkably easy to adapt your pipeline. For instance, switching from a file-based input to a Kafka stream can be done with minimal fuss, offering excellent long-term flexibility.
Purpose-built for data pipelines: NiFi excels at its core competency: creating, maintaining, and monitoring data pipelines, particularly those that are streaming-based. Its entire design is centered on this purpose.
Excellent observability: Once a pipeline is running, NiFi provides fantastic tools to keep an eye on it. You get detailed stats, powerful data provenance to track the lineage of every piece of data, and the ability to switch parameter contexts on the fly.
The Not-so-good: The rough edges
Despite its power, our experience with NiFi was not without friction. The learning curve was steeper than anticipated, and several aspects of the tool felt counter-intuitive.
A steep learning curve: Some simple tasks can feel harder than they should. To issue a POST request, for example, the payload is taken directly from the FlowFile content, forcing awkward “save‑transform‑restore” hoops. We found this inefficient and overly complex for such a common workflow.
Dated user experience: The user experience feels like a step back in time. The UI refreshes automatically every 30 seconds, some important menus are difficult to reach, and most shockingly, there is no option to undo. This can make development frustrating.
Complex error handling: While NiFi provides the building blocks for retries and error handling, implementing them idiomatically is far from straightforward. Common solutions involve dumping errored FlowFiles to disk and creating a separate flow to re-process them. This works, but it feels like a workaround for what should be a core, built-in feature.
It’s not an orchestrator: It’s crucial to understand that while NiFi can manage data flow, it is not a true orchestrator. Complex dependencies and conditional workflows are better (and more easily) handled by dedicated orchestration tools like Apache Airflow or Prefect.
Code is still king for complexity: The most intricate steps still demanded custom code or even full-fledged Java projects. A case in point is our data‑enrichment stage, which applies dozens of business rules. We ended up writing a custom NiFi processor backed by its own Java module.
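For comparison with the error-handling point above, this is roughly what retries look like in plain Python. It is a minimal sketch, not our production code: process_with_retries, its parameters, and the in-memory list of failed items (standing in for the "dump to disk and re-process" flow) are all hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def process_with_retries(
    items: Iterable[T],
    handler: Callable[[T], None],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> list[tuple[T, Exception]]:
    """Run handler on each item, retrying with exponential backoff.

    Items that still fail after max_attempts are returned together with
    their last exception so the caller can re-process or persist them.
    """
    failed: list[tuple[T, Exception]] = []
    for item in items:
        for attempt in range(1, max_attempts + 1):
            try:
                handler(item)
                break  # success: move on to the next item
            except Exception as exc:
                if attempt == max_attempts:
                    # Out of attempts: park the item instead of crashing.
                    failed.append((item, exc))
                else:
                    # Wait base_delay, then 2x, 4x, ... before retrying.
                    sleep(base_delay * 2 ** (attempt - 1))
    return failed
```

The sleep function is injectable only so the behavior can be tested without real delays; in normal use the default time.sleep applies.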
The infrastructure gap & the AI question
Two broader points stood out during our evaluation, one about infrastructure and the other about the future of UI-based tools.
On the infrastructure front, it’s surprising that there isn’t a major managed NiFi service from AWS, Google Cloud, or Azure. Most production-ready, containerized deployments must be custom-built, adding an operational overhead that many teams will want to avoid.
More philosophically, we began to question the role of UI-based tools in the age of AI coding assistants. Five years ago, a tool like NiFi made much more sense. Today, we wonder if it still holds the same appeal for two main reasons:
AI models still struggle to create or debug pipelines built in visual coding tools like NiFi.
The utility of many built-in processors can now be replicated in minutes by an AI assistant, directly in your language of choice.
Conclusions
Apache NiFi is a mature and powerful tool for data flow management. The idea behind it is compelling: a visual coding platform with hundreds of built-in components to build data pipelines easily. However, once you dive deeper and encounter its steep learning curve, dated UX, and convoluted patterns, you realize it’s not as straightforward as it first appears.
The experience can be disappointing, and it raises the question of whether the investment in learning NiFi’s specific ways is worth it. For teams that push through this friction, the reward is a set of undeniably powerful tools. Whether this trade-off makes sense will depend on your team. It is most likely to pay off when a technical team handles the complex implementation and a non-technical team then takes over monitoring or even maintenance. That second team will greatly benefit from the visual nature of NiFi and the ability to make changes live, with no deployments involved.
For our team at The Agile Monkeys, this model didn’t fit. We keep exploring alternatives that let us build and adapt data pipelines for our clients in a simple and flexible way. As an example, our latest approach involved a microservice-oriented architecture orchestrated with Apache Airflow. This approach felt more natural to the team and, as a result, made implementation and maintenance simpler.
The Agile Monkeys
At The Agile Monkeys, this kind of deep dive reflects our core philosophy: we continuously explore, test, and select the best tools to build cutting-edge AI solutions. Our internal Learning Program empowers our team to experiment with new technologies, models, and techniques.
We’ll continue to publish our findings as we go. If you’re wrestling with similar architectural decisions or just want to swap war stories, we’d love to hear from you. We’re always ready to geek out over AI and software engineering!