Separating Real AI Value from the Hype
How to identify AI solutions that truly make a difference.
Good morning, everyone!
Today’s topic is inspired by Stephen Diehl, our most recent premium subscriber (thank you for supporting us!), who raised a good point about the hype versus the actual value of AI.
We’re diving into the growing hype surrounding AI in every corner of software, from everyday apps to advanced enterprise solutions. It seems every company is touting its AI capabilities, but how much of it is truly valuable versus flashy, empty promises? How do we separate AI that produces real value from AI that’s just there to ride the trend? Let’s answer that…
AI is everywhere, but not all of it is meaningful or useful. The vast majority of the news and posts we see online are fluff, and learning to identify them quickly is an important skill that will make you more trustworthy in the field. Let’s explore how we can tell the difference.
The Hype vs. Real Value of AI
When useful, AI tackles specific, often tedious tasks that would otherwise take significant (and boring) human effort. Think of tools like GitHub Copilot, which helps developers write code up to 55% faster (as GitHub constantly reminds us*) by generating code snippets based on user input, or Zapier, which automates workflows like sending follow-up emails based on client interactions. These aren’t just gimmicks; they directly improve productivity and streamline workflows.
On the flip side, some AI tools add complexity. For example, certain meeting-scheduling AI platforms overengineer the process, making a simple task harder instead of faster. Sending a Calendly link, configured once and synced with your calendar, already solves that problem. Knowing how to spot the difference between useful and overly complicated AI is the key.
But how do we identify when AI is helpful? Here’s a quick checklist of what makes AI valuable, alongside practical examples:
It should reduce manual effort: AI is best applied to automate repetitive tasks we already know how to do. For instance, AI-powered invoice processing eliminates manual data entry using Optical Character Recognition (OCR) to extract data from documents and match them with purchase orders. Similarly, tools like Zapier help businesses automate email follow-ups, reducing the need for manual tracking—assuming you know how to follow up on such emails properly.
It should save time: In software development, tools like GitHub Copilot save time by suggesting relevant code, allowing developers to complete tasks 55% faster. It suggests code, but it doesn’t code for you; you will still need a skilled person to review and verify the output. Think of AI as a super-intelligent, lightning-fast intern that still makes intern-level mistakes. In customer service, AI chatbots can handle routine queries, freeing up human agents to focus on complex problems, but a human should always be available for more complicated situations.
It should integrate smoothly: A great example of smooth AI integration is Microsoft Copilot, which is embedded into existing apps like Word, Excel, and Teams. This allows users to access AI-powered features, like document summarization and data analysis, without leaving their workflow. If a new feature requires a totally new UI and an experience external to your current product, it isn’t a useful feature; it’s an entirely new product.
It should deliver accurate results: Accuracy is critical in fields like healthcare. AI tools used in medical diagnostics help radiologists detect abnormalities in CT scans or X-rays, improving diagnostic speed and accuracy. In manufacturing, AI-powered cameras can detect product defects on assembly lines in real time, enhancing quality control. Keep in mind that LLMs are not deterministic: their outputs carry randomness and can include hallucinations. Minimize both with systems that ground the model in verified data, such as retrieval-augmented generation (RAG).
It should be intuitive and easy to use: AI should feel natural. LLMs integrated into customer service workflows allow for intuitive communication in plain language. There is no need to be an expert—just ask, and the AI responds. Similarly, Nimble’s project management tools offer personalized, AI-generated task suggestions that streamline workflows without requiring technical expertise. The AI addition needs to be seamless for your users.
It should solve a pain point: Use it to solve a problem you or your clients have that cannot be fixed otherwise. Don't start a new project with: "I want to add AI to our systems." That's a common mistake. Consulting sessions often begin with this statement, revealing a lack of understanding about AI. It's like asking a server in a restaurant to bring you food without specifying what you want—you need to be more specific. The first step is to get informed, just like you're doing now! A solid AI system starts with the need to solve a real-world problem that traditional software cannot address.
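The RAG idea mentioned in the accuracy point above can be sketched in a few lines: retrieve the documents most relevant to a question, then build a prompt that forces the model to answer only from that context. This is a minimal, illustrative sketch; the keyword-overlap retriever stands in for a real embedding-based search, and the function names are my own, not from any specific library:

```python
def retrieve(query, documents, k=2):
    """Rank documents by word overlap with the query (stand-in for embeddings)."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_grounded_prompt(query, documents):
    """Assemble a prompt that constrains the model to the retrieved context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return (
        "Answer using ONLY the context below. "
        "If the answer is not there, say you don't know.\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

docs = [
    "Invoices are processed nightly by the OCR pipeline.",
    "Refunds are issued within 5 business days.",
    "Our office is closed on public holidays.",
]
prompt = build_grounded_prompt("How long do refunds take?", docs)
```

The prompt handed to the LLM now contains the refund policy verbatim, so the model has far less room to invent an answer; that grounding step is what reduces hallucinations.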
Research and AI Hype
The AI hype isn’t just limited to software products—it also extends to research. In the race to publish cutting-edge models, it’s common for researchers to claim their approach achieves state-of-the-art (SOTA) results. However, these claims often fall short due to differences in implementation, benchmark selection, or reproducibility issues. Researchers sometimes cherry-pick results to show their models in the best possible light, leading to inflated performance claims.
This creates a challenge for the entire AI field—when every new model is marketed as SOTA, it’s hard to differentiate true innovations from minor improvements. The exponential increase in AI research papers makes this even more difficult. We’re seeing a flood of papers that makes it harder for reviewers to evaluate each one rigorously, and this compromises the quality of published research.
Caption: The number of papers published monthly in the arXiv categories of AI is growing exponentially. The doubling rate of papers per month is roughly 23 months, which could lead to bottlenecks for publishing in these fields. The relevant categories are cs.AI, cs.LG, cs.NE, and stat.ML. Caption and image from Krenn et al.
With more AI papers published every month, tracking what’s genuinely groundbreaking becomes harder. The result? Even within research, the AI hype is growing, so it’s essential to stay skeptical. This is why it’s important to build your own evaluations and stay informed through trusted benchmarks of what’s state-of-the-art (which will be the topic of another newsletter!).
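"Build your own evaluations" can start very small: a fixed set of prompts with expected answers, scored automatically every time you change the model or the prompt. A minimal sketch, where `fake_model` is a placeholder for your real model or API call and the cases are illustrative:

```python
# A tiny eval harness: fixed cases, one scoring rule, one tracked number.
EVAL_CASES = [
    {"prompt": "2 + 2 = ?", "expected": "4"},
    {"prompt": "Capital of France?", "expected": "Paris"},
]

def fake_model(prompt: str) -> str:
    """Placeholder for a real model/API call."""
    canned = {"2 + 2 = ?": "4", "Capital of France?": "Paris"}
    return canned.get(prompt, "")

def run_eval(model, cases):
    """Crude correctness check: expected answer appears in the output."""
    hits = sum(
        1 for c in cases
        if c["expected"].lower() in model(c["prompt"]).lower()
    )
    return hits / len(cases)

score = run_eval(fake_model, EVAL_CASES)  # track this across model changes
```

Even a harness this crude beats eyeballing outputs: the score is reproducible, so a regression after a prompt tweak or model upgrade shows up immediately instead of anecdotally.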
Conclusion
Implementing AI correctly is a game changer for productivity, profit, user experience…
The key to recognizing valuable AI is simple: If it helps you get your work done faster, easier, and with less friction, it’s useful. But if it makes your work more complicated, adds unnecessary steps, or seems like a magical tool replacing experts, then it’s probably AI for AI’s sake (or fake).
In the end, what matters most is that AI should work for you—not impress you with how advanced it looks. The best solutions focus on real-world problems, simplify workflows, and enhance productivity without requiring you to rethink how you already work (in most cases).