aelouass.bsky.social
CS / DS / ML / AI (whatever it is called now) Ph.D. Engineer. Past: INSA-Lyon, LIRIS. Main topics: sequential and temporal data, models, and reasoning. Currently playing with manufacturing data, and interested in RL and Graph DL.
37 posts 130 followers 1,348 following

@jfoerst.bsky.social's take on how the community sees the ARC Challenge, and on how we evaluate models and use benchmarks nowadays, is 👌. #more_science_less_hype (please). PS: Amazing discussion and good brain food, as usual with MLST.

I missed this one when it came out, but I can say that it is one of the most useful pieces of research I’ve read in a while. “GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models” arxiv.org/html/2410.05...

We really need better brain-power allocation. The current algorithm is going kind of crazy.

100%

The more I read and listen to current debates in the field, the more I’m convinced that we have a model evaluation crisis.

I never understood people who go to concerts only to watch them through the tiny screens of their phones.

Is it just me or are we in an Eliza effect pandemic?

Huh, what a year! Happy new year, everyone! May it be a better one than 2024 (it’s not that hard, though). Take care of your loved ones.

There is no ultimate benchmark. Good results on a benchmark mean that a model cracked it. How it cracked it shows the extent of the progress towards solving the underlying problem. Sometimes cracking a benchmark mostly tells you that the benchmark is no longer sufficient to measure progress.

🤞🏽

Definitely a good book. Not a textbook, but a good resource for anyone interested in the field who has some math literacy.

An updated intro to reinforcement learning by Kevin Murphy: arxiv.org/abs/2412.05265! Like their books, it covers a lot and is quite up to date with modern approaches. It is also pretty unique in coverage; I don't think a lot of this is synthesized anywhere else yet.

I think I've gotten more interesting insights from the feed here in less than a week than I got from the other place's in more than a year.

The unofficial GIF-based pandas library documentation. pandas.DataFrame.rolling

I just deactivated my X account. Bluesky now has exclusivity on my procrastination scrolling.

Hello World!