carl24k.bsky.social
#DataScience & #MachineLearning - #Neuroscience & #NeuroAI - Author of Fighting Churn With Data - https://linktr.ee/carl24k
186 posts 95 followers 200 following
comment in response to post
Yes I remember that one - he thought his robot was going to save the day! It’s not just that he thinks tech will save the day, but he thinks he is an expert in everything - even when he knows nothing
comment in response to post
Christof was my PhD advisor!
comment in response to post
Yeah, once they are teenagers and start blowing you off you’ll get all your time back. That’s also when they can reliably do chores. But you will be sad that you don’t have as many chances to connect.
comment in response to post
Yes, I keep wondering who is in charge on this issue? Or is there a hashtag for it? #spamaccounts ?
comment in response to post
I just keep reminding people it’s not reasoning - just picking the next token probabilistically. It’s shocking how smart people who should know better fall into anthropomorphizing. And I bide my time working on more rigorous ML, waiting for the bubble to burst
comment in response to post
Yeah it’s basically just an imbalanced tabular data problem. And usually one with a lot of correlated features. And binary prediction is not helpful in practice - you always need to rank churn risks, never classify people. The imbalance means you always have a high false positive rate
comment in response to post
So that’s what I would have said if I had been a reviewer at the journal! TLDR: Don't use #neuralnetworks for churn. They take much longer to train/predict and are less reliable.
comment in response to post
6. The authors don't cite Fighting Churn With Data, the only textbook totally about churn and data science. 😉
comment in response to post
5. The datasets have very few features (15-20). Real churn datasets are created with 50-250 features.
comment in response to post
4. Despite a lot of talk about class imbalance, the churn datasets are not very imbalanced - 10%-20% churn rates. Really imbalanced data has low single-digit churn rates.
comment in response to post
3. #Gradientboosting is VERY interpretable with the #SHAP method. They are totally misleading by saying their deep neural network is more interpretable and boosting is not. They are apparently ignorant of these important advances in interpretability, now more than 5 years old.
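A minimal sketch of the point above, on synthetic stand-in data (all names and numbers here are illustrative, not from the paper being reviewed). The usual route is per-prediction SHAP values via `shap.TreeExplainer`; since the `shap` package may not be installed everywhere, this sketch uses scikit-learn's `permutation_importance` as a lighter-weight global attribution for a gradient-boosted model:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

# Hypothetical stand-in for a churn dataset: ~10% positive class.
X, y = make_classification(n_samples=2000, n_features=10,
                           n_informative=4, weights=[0.9],
                           random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X, y)

# Global attribution: which features actually drive the model.
# (shap.TreeExplainer would give the same story per-prediction.)
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)
for i in result.importances_mean.argsort()[::-1][:4]:
    print(f"feature {i}: {result.importances_mean[i]:.3f}")
```

The informative features come out on top, which is exactly the kind of explanation the "boosting is a black box" framing claims is impossible.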
comment in response to post
Importantly, the precision/recall metrics they show in their results will be sensitive to the thresholds, which are not detailed - and that’s a tricky issue for imbalanced data. This is another reason not to believe the supposed accuracy improvement.
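To see the threshold sensitivity concretely, here is a sketch on synthetic imbalanced data (assuming scikit-learn; the ~5% churn rate and cutoffs are illustrative). The same fitted model reports very different precision/recall depending on an arbitrary cutoff:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced churn data: ~5% positive class.
X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.95], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

proba = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)\
    .predict_proba(X_te)[:, 1]

# One model, three "results", depending only on the cutoff.
for t in (0.2, 0.5, 0.8):
    pred = (proba >= t).astype(int)
    print(f"threshold={t}: "
          f"precision={precision_score(y_te, pred, zero_division=0):.2f} "
          f"recall={recall_score(y_te, pred):.2f}")
```

If a paper reports precision/recall without stating the threshold, the comparison to benchmarks is not reproducible.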
comment in response to post
2. #Churn models should *not* be evaluated with precision/recall but rather with AUC: True/false churn predictions are NEVER used, but rather risk rankings. (Always use predict_proba for churn, never predict.)
comment in response to post
#ML researchers always do this - they slave away tuning their preferred model, and use their benchmarks "as is" without tuning.
comment in response to post
1. Gradient boosting (#XGBoost or LGBM) is state of the art in the real world. I don't believe they put much effort into tuning their benchmark models, so don't believe the claims of higher accuracy.
comment in response to post
What is #bluesky doing about the bots???
comment in response to post
Wow, and I thought it was just me being avoidant because of my childhood trauma! I actually feel relieved it’s a trend lol (written from a meal out alone)
comment in response to post
typical tech bro - smart about maybe one thing, but mansplains about everything
comment in response to post
That’s interesting - tbh I was not thinking that big. You mean like the fact that so much of the recent stock market gains were big tech and Nvidia? And if Nvidia’s and Microsoft’s share prices crashed there could be a contagion. What else were you thinking?
comment in response to post
Yes I’m afraid all the #aihype has drawn money away from more traditional #neuroscience research. Seems like neuroscientists have to use industry tech like #deeplearning and #genai to get funding. Even when it’s not sensible. Hopefully #aibubble bursting won’t affect basic research 🤞🏻
comment in response to post
Survey of young people showed 50% wish TikTok had never been invented, but only 10% wish YouTube had never been invented. So there’s something different about them www.nytimes.com/2024/09/17/o...
comment in response to post
While people like me will point to our skeptical LinkedIn posts from the past two years to show we were right, haha
comment in response to post
Maybe. But there really are some valuable things in this wave. And not all the things people think. I’m personally more excited about #CausalML than #LLM! No one knows about CML but it will be remembered as a product of this wave.
comment in response to post
Seems like any technology going through the classic Gartner hype cycle - inflated expectations, then disillusionment, then real value
comment in response to post
@tonyzador.bsky.social ‘s piece in the @thetransmitter.bsky.social is a good one www.thetransmitter.org/neuroai/neur...
comment in response to post
Caltech CNS was different- it’s Computation and Neural Systems, not systems neuroscience. Look it up. Carver Mead and Terry Sejnowski started it way back when
comment in response to post
Sounds like the same thing as the Computation and Neural Systems interdisciplinary program where I did my PhD at Caltech 20 years ago (#caltechCNS). I hope the rebranding works!
comment in response to post
Does it explain what #NeuroAI actually means?
comment in response to post
Will there be a livestream?
comment in response to post
I agree that part sounds like marketing. I’m not convinced we know what the aspects to simulate are yet - even if we knew all the connections we would still not get a working system.