Using Claude for ML: Great at Scaffolding, Rough on Efficiency
I recently used Claude (Anthropic's LLM) as a coding partner for a project predicting patient appointment show rates. It was impressive in some ways, but frustrating in others.
Here's a quick breakdown of what worked and what didn't.
1. Full Pipeline in One Go
Claude's biggest strength? Scaffolding.
With a single prompt, it gave me:
- Data ingestion
- Preprocessing
- Feature engineering
- Train/test split
- Model training (logistic regression + XGBoost)
- Evaluation
Only one minor bug, which it fixed instantly. Solid for bootstrapping fast.
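For a sense of scale, here is a minimal sketch of that kind of scaffold. It is not the code Claude produced: the data is synthetic, every column name (lead_time_days, prior_no_shows, age, showed_up) is made up for illustration, and I've left out XGBoost so that the only dependency beyond pandas and NumPy is scikit-learn.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Data ingestion: a synthetic stand-in for the appointment data.
# All column names here are hypothetical.
rng = np.random.default_rng(0)
n = 2_000
df = pd.DataFrame({
    "lead_time_days": rng.integers(0, 60, size=n),
    "prior_no_shows": rng.poisson(0.5, size=n),
    "age": rng.integers(18, 90, size=n),
})
# Made-up relationship so the model has something to learn.
logit = 1.5 - 0.03 * df["lead_time_days"] - 0.8 * df["prior_no_shows"]
df["showed_up"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Feature engineering: one illustrative derived feature.
df["long_lead"] = (df["lead_time_days"] > 30).astype(int)

# Train/test split, stratified on the label.
X = df.drop(columns="showed_up")
y = df["showed_up"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Preprocessing and model in one pipeline, so the scaler is fit on training data only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Evaluation on the held-out split.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Test ROC AUC: {auc:.3f}")
```

Wrapping the scaler and the model in one pipeline is a design choice worth keeping even in a throwaway scaffold, since it avoids leaking test-set statistics into preprocessing.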
2. Over-Engineered by Default
Claude tries to be "thorough", sometimes to a fault.
In my case, it grabbed 10+ features without any real EDA or justification. No check for multicollinearity. No signal analysis. Just… everything.
Result:
- Risk of overfitting
- Harder to interpret/debug
- More compute than necessary
Next time: I'll prompt it to "use top 3–5 features based on correlation or SHAP."
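As a rough idea of what the correlation half of that prompt could look like, here is a minimal sketch. The helper name top_k_by_correlation, the 0.9 pairwise threshold, and the demo columns are all my own assumptions, and it uses plain Pearson correlation rather than SHAP. It ranks numeric features by absolute correlation with the target and skips any feature that is nearly a duplicate of one already picked, which doubles as a crude multicollinearity check.

```python
import numpy as np
import pandas as pd


def top_k_by_correlation(df, target, k=5, max_pairwise=0.9):
    """Pick up to k numeric features ranked by |Pearson correlation| with the
    target, skipping features highly correlated with one already chosen."""
    features = df.drop(columns=target).select_dtypes("number")
    ranked = (
        features.corrwith(df[target]).abs().dropna().sort_values(ascending=False)
    )
    pairwise = features.corr().abs()
    chosen = []
    for name in ranked.index:
        # Keep the feature only if it is not a near-duplicate of a kept one.
        if all(pairwise.loc[name, kept] < max_pairwise for kept in chosen):
            chosen.append(name)
        if len(chosen) == k:
            break
    return chosen


# Synthetic demo data; the column names are hypothetical.
rng = np.random.default_rng(0)
n = 1_000
signal = rng.normal(size=n)
demo = pd.DataFrame({
    "strong": signal + rng.normal(scale=0.1, size=n),
    "strong_copy": signal + rng.normal(scale=0.1, size=n),  # near-duplicate of "strong"
    "weak": 0.3 * signal + rng.normal(size=n),
    "noise": rng.normal(size=n),
    "showed_up": (signal > 0).astype(int),
})

selected = top_k_by_correlation(demo, "showed_up", k=2)
print(selected)
```

Correlation only captures linear, one-feature-at-a-time signal, so treat this as a first filter rather than a final feature set.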
3. Code Wasn't Optimized
This was the biggest pain point.
Even on a ~500K row dataset, Claude:
- Used .apply(lambda x: …) instead of vectorized ops
- Wrote O(n²) logic (e.g. nested loops over DataFrames)
- Recomputed expensive ops inside loops
It writes code that's correct and readable, but not performant. Good enough for toy data, but not production.
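To make those patterns concrete, here is a small sketch of each one next to a faster equivalent. This is not the code from my project: the data is synthetic, the column names (patient_id, lead_time_days, no_show) are invented, and I'm assuming the slow version was a row-wise apply and a per-patient rescan of the frame.

```python
import numpy as np
import pandas as pd

# Synthetic appointments table; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 10_000
appts = pd.DataFrame({
    "patient_id": rng.integers(0, 500, size=n),
    "lead_time_days": rng.integers(0, 60, size=n),
    "no_show": rng.integers(0, 2, size=n),
})

# 1) Row-wise apply vs. a vectorized expression on the whole column.
slow_flag = appts.apply(lambda row: int(row["lead_time_days"] > 30), axis=1)
fast_flag = (appts["lead_time_days"] > 30).astype(int)

# 2) Loop that rescans the full frame for each patient vs. a single groupby.
#    The loop does O(patients * rows) work; the groupby makes one pass.
sample_ids = appts["patient_id"].unique()[:20]  # keep the slow path short
slow_rate = {
    pid: appts[appts["patient_id"] == pid]["no_show"].mean() for pid in sample_ids
}
fast_rate = appts.groupby("patient_id")["no_show"].mean()

# 3) Instead of recomputing the rate inside a loop, compute it once per
#    group and broadcast it back to every row.
appts["patient_no_show_rate"] = (
    appts.groupby("patient_id")["no_show"].transform("mean")
)
```

Both versions give the same numbers; the difference only shows up in runtime as the row count grows, which is why it is easy to miss on toy data.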
4. Sometimes It Won't Just Chill
One frustrating pattern: ask Claude a question, and it thinks you're giving instructions.
Me: "What if we tried a decision tree instead?"
Claude: rewrites the whole pipeline with DecisionTreeClassifier
I wasn't ready to change anything yet. I just wanted to talk it through.
Fix: Be specific. "Please don't change the code. Just discuss pros/cons."
TL;DR
What Worked
- End-to-end scaffolding
- Clear structure
- Good at fixing bugs
- Fast prototyping
What Didn't
- Over-engineered feature set
- Unoptimized (O(n²)) logic
- Rewrites code without being asked
- Needs manual refactoring
Would I Use It Again?
Yes, with constraints.
Claude is like a junior engineer who works fast, doesn't ask questions, and ships verbose-but-correct code. But like any junior dev, it needs review and cleanup.
Next time I'll:
- Limit feature count up front
- Ask for optimization explicitly
- Be clearer about when I want discussion vs changes
Still a great tool, just not hands-off.