Choosing your next CLI: Codex, Claude, Warp, Goose, or Gemini?

29 Jul 2025 · 7 minute read

Zachary Galbraith

Contributing to the AI Native dev movement and sharing insights with the community

Recently, we have seen an explosion in AI-powered CLI tools. With so many options, picking the right one can seem difficult. In this post, I’ll break down what makes a CLI tool good, compare the most popular ones, and help you pick one based on your budget and preferences.

Why Are Some CLI Tools Better Than Others?

Despite using the same models, some CLI tools can perform significantly better than others. Why is this? In my experience, it comes down to three main factors.

1. System Prompts: The quality of system prompts varies across tools, which in turn leads to differences in performance. For example, Claude Code has a carefully tuned system prompt that introduces the available tools, defines roles, and adjusts behavior, whereas some open source tools may expect you to tune the system prompt yourself.

2. Memory Management: Good CLI tools diligently manage their model’s context. Knowing which types of context to prioritize is what makes modern CLI tools powerful. This might mean passing in a bullet list of user preferences with every query, or adding documentation based on the user prompt.

3. Looping: Some tools handle errors better than others. Good tools give the model useful context about what went wrong and how to fix it, so the next attempt can correct the problem.
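
To make the second and third factors more concrete, here is a minimal Python sketch of how a tool might assemble context and feed errors back to the model. It is not how any of the tools in this post are actually implemented. The names `build_context` and `run_loop`, the keyword-matching rule for picking documentation, and the `model` and `execute` callables are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable


def build_context(system_prompt: str, preferences: list[str],
                  docs: dict[str, str], user_prompt: str) -> str:
    """Assemble the text sent to the model for one query (illustrative only)."""
    # Memory management: include the user's preferences as a bullet list.
    prefs = "\n".join(f"- {p}" for p in preferences)
    # Attach only the documentation whose topic is mentioned in the prompt.
    # (Simple keyword matching is an assumption; real tools may rank differently.)
    relevant = [text for topic, text in docs.items()
                if topic.lower() in user_prompt.lower()]
    parts = [system_prompt, "User preferences:\n" + prefs]
    if relevant:
        parts.append("Relevant docs:\n" + "\n".join(relevant))
    parts.append("Task: " + user_prompt)
    return "\n\n".join(parts)


def run_loop(model: Callable[[str], str],
             execute: Callable[[str], tuple[bool, str]],
             context: str, max_attempts: int = 3) -> tuple[bool, int]:
    """Ask the model for a command, run it, and feed any error back."""
    for attempt in range(1, max_attempts + 1):
        command = model(context)
        ok, output = execute(command)
        if ok:
            return True, attempt
        # Looping: append the failure so the next attempt can correct it.
        context += f"\n\nAttempt {attempt} ran `{command}` and failed:\n{output}"
    return False, max_attempts
```

Here `model` would wrap a call to whichever LLM the tool uses, and `execute` would run the proposed command and report success plus its output. The point of the sketch is that the quality of what gets appended to `context` on each pass is where tools can differ, even when they share the same underlying model.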

Claude Code

Ranked third on Terminal-Bench and one of the most popular AI CLI tools, Claude Code is a great choice for most software developers. It uses the Claude 4 Opus and Sonnet models, which have been dominating software development benchmarks recently, and it is also known for its strong tool use. Some users have found Claude Code essential to their workflow: “I've been using Claude Code extensively since its release, … It's so effective that I've been able to handle bug fixes and development tasks that I previously outsourced to freelancers”. Pricing starts at $20/month for Pro and goes up to $200/month for Max. Notably, there is no free tier.

Gemini CLI

Sadly, the Gemini CLI isn’t yet on the Terminal-Bench leaderboard, which suggests either that it’s too new or that it didn’t make the cut. That being said, I have seen comparisons online suggesting that the Gemini CLI competes with other CLI tools such as Claude Code. Gemini CLI is open source and uses reasoning loops with built-in tools to complete coding tasks. It supports multimodal inputs, enabling workflows such as building a website from a rough sketch. Its claim to fame is the generous pricing, with 1,000 requests per day at no charge.

The general sentiment from users appears to be that while the free tier is great, the tool itself doesn’t stack up to the likes of Claude Code: “From my perspective, Gemini CLI is useless—it’s the only LLM that has repeatedly refused to work with me because of so-called "offensive language." And when it does work it's overly verbose and often just plainly wrong. … Claude Code wins for me”.

Codex CLI

Ranked 19th on Terminal-Bench, Codex performs significantly worse than other CLI tools. However, OpenAI’s Codex still has viable use cases for development. One benefit is that Codex supports the multimodal inputs that OpenAI’s models accept. Additionally, OpenAI has developed a model specifically for this TUI tool, which suggests continued support. A user noted, “I tried Codex yesterday, and it cleverly navigates files instead of uploading everything… It helped me find the problematic file and function in a huge codebase,” highlighting its ability to work efficiently in large projects. Other users, though, have found it lacks the power of other CLI tools: “Because of the cost, I tried Codex, but after a few simple questions I gave up. Nowhere close to CC [Claude Code] when it comes to tool usage.”

Pricing-wise, Codex follows standard API rates, with Plus users receiving $5 in API credits and Pro users getting $50.

Warp

Ranked first on Terminal-Bench, Warp is a so-called ‘Agentic Development Environment’. It has a GPU-accelerated UI and a modern terminal UX with features like block-based output. You should check out Warp if you want a new IDE experience focused on development with agents. One user said, “I started using WARP with Claude sonnet-4 engine. This thing is absolutely awesome. If you are not using AI boosted terminal you are probably wasting time.” You don’t have to choose between Claude Code and Warp, either: some users run Claude Code inside Warp, relying on Warp for its UI improvements over traditional terminal applications. Warp’s pricing ranges from a free tier with 150 requests per month up to a $40/month Turbo plan.

Goose

Ranked fourth on Terminal-Bench, Block’s Goose is a great, high-performance CLI tool for agentic development. Being open source, Goose is supported through community efforts on its GitHub repo. Its main advantages are transparency and strong customization options. One user found, “So I gave goose a whirl and I actually really like the approach they are taking, … I would recommend people try it out on an existing project.”

Emphasizing its community feel, Goose doesn’t have any subscription tiers, instead relying on a bring-your-own API key approach.

Conclusion

While many of these CLI tools use the same underlying models and seem similar, there are major differences between them. The right tool for you depends on your budget, development process, and personal preferences.

If you don’t mind paying and want the most powerful tools, go for Claude Code or Warp. Goose and Codex are good picks if you want to customize your tools or want pricing transparency. Gemini packs a punch at a low cost, so it’s a great option for budget-conscious development. The best tool is the one that fits into your workflow. I encourage you to try a few and see what works best for you.