I recently used Claude (Anthropic's LLM) as a coding partner for a project predicting patient appointment show rates. It was impressive in some ways β€” but frustrating in others.

Here's a quick breakdown of what worked and what didn't.

1. Full Pipeline in One Go βœ…

Claude's biggest strength? Scaffolding.

With a single prompt, it gave me:

  • Data ingestion
  • Preprocessing
  • Feature engineering
  • Train/test split
  • Model training (logistic regression + XGBoost)
  • Evaluation

Only one minor bug, which it fixed instantly. Solid for bootstrapping fast.

2. Over-Engineered by Default

Claude tries to be "thorough" β€” sometimes to a fault.

In my case, it grabbed 10+ features without any real EDA or justification. No check for multicollinearity. No signal analysis. Just… everything.

Result:

  • Risk of overfitting
  • Harder to interpret/debug
  • More compute than necessary

Next time: I'll prompt it to "use top 3–5 features based on correlation or SHAP."

3. Code Wasn't Optimized

This was the biggest pain point.

Even on a ~500K row dataset, Claude:

  • Used .apply(lambda x: …) instead of vectorized ops
  • Wrote O(nΒ²) logic (e.g. nested loops over DataFrames)
  • Recomputed expensive ops inside loops

It writes code that's correct and readable β€” but not performant. Good enough for toy data, but not production.

4. Sometimes It Won't Just Chill

One frustrating pattern: ask Claude a question, and it thinks you're giving instructions.

Me: "What if we tried a decision tree instead?"
Claude: rewrites the whole pipeline with DecisionTreeClassifier

I wasn't ready to change anything yet. I just wanted to talk it through.

Fix: Be specific. "Please don't change the code. Just discuss pros/cons."

TL;DR

What Worked βœ…

  • End-to-end scaffolding
  • Clear structure
  • Good at fixing bugs
  • Fast prototyping

What Didn't ❌

  • Over-engineered feature set
  • Unoptimized (O(nΒ²)) logic
  • Rewrites code without being asked
  • Needs manual refactoring

Would I Use It Again?

Yes β€” with constraints.

Claude is like a junior engineer who works fast, doesn't ask questions, and ships verbose-but-correct code. But like any junior dev, it needs review and cleanup.

Next time I'll:

  • Limit feature count up front
  • Ask for optimization explicitly
  • Be clearer about when I want discussion vs changes

Still a great tool β€” just not hands-off.