nathanpaull.bsky.social
AI Researcher | Data Science Consultant | Basketball Nerd
31 posts 16 followers 6 following
Regular Contributor
Conversation Starter

The biggest winner of the SEC Tournament for me is the Tennessee Vols. Gainey is playing with confidence and scoring at true second-option levels, Lanier is an offensive juggernaut, and the Vols' defense is stifling. This could be the moment that starts a big March Madness run for the Vols.

The worst lesson Democrats could take from the most recent election is that they were too far left and need to appear more conservative. Trump was not elected for his "conservative values"; he was elected to destroy a system that people felt no longer benefits them.

... watch solo leveling ...

I am struggling with my feelings on the stock market tumbling. On the one hand, it was so easy to predict that Trump's half-baked tariff plan would cause this. On the other, these are people's retirements that he is destroying. Whether they voted for it or not, it is sad how little Trump cares.

The Lakers have been super fun to watch, but I don't know that I am buying the playoff hype yet. There are few players on the team who contribute at a high level on both ends. This hasn't been a problem in the regular season, but we will see what happens in the playoffs. Great basketball regardless.

Crazy that $TSLA has now lost all of its post-election gains. It really seems like the stock's value is contingent on Musk remaining by Trump's side. Given Trump's record of staff turnover...

Donovan Mitchell is quietly having an amazing season. It is crazy to me that he is not truly in any MVP conversations. He is easily the best player on the best team (by record) in the East and is currently ranked below Giannis, LeBron, Ant, and Tatum. Give this man the respect he deserves!

This has gotta be the worst season in history. From a Finals run to this? They won't even get a good pick as a reward for all of this. Kick the Mavs out of the damn league, man, free these players and fans.

Hugging Face just entered the top 10 organizations on @github.com. Close to 500,000 GitHub stars across our open-source libraries! Couldn't be more proud of what this 220-person team is accomplishing.

The NBA ASG was terrible, but I can't disagree more with the opinion that the NBA has become complacent because of the money. The reason the ASG (and the regular season) is bad is that players only care about winning a ring. Entertainment and competition are at odds in the NBA.

This team is interesting, but they are not real contenders. Every player listed in this starting lineup has some serious limitations on the defensive end. DFS and Vando can definitely help, but without Father Time turning back LeBron's clock, this team will struggle against top-tier teams.

With all the recent discussion around DeepSeek, I wanted to share my perspective as someone working to advance the science. The noise on this topic has been really frustrating, and it led me to write down my thoughts on DeepSeek and the future of AI. open.substack.com/pub/nathanpa...

This is very similar to the work I have been doing. Super cool to see that I am not the only one who still believes in BERT. There is so much left to gain here. I will be interested to look into the ablations around their local vs. global attention tradeoff and see their training data.
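
For anyone unfamiliar with that tradeoff, here is a minimal PyTorch sketch of what it means in practice: local layers restrict each token to a sliding window while global layers attend over the full sequence. The window size and the one-global-per-three-layers interleaving below are my own illustrative assumptions, not the configuration from their paper.

```python
import torch

def local_attention_mask(seq_len: int, window: int) -> torch.Tensor:
    """Boolean mask where entry [i, j] is True if token i may attend to token j."""
    idx = torch.arange(seq_len)
    return (idx[:, None] - idx[None, :]).abs() <= window // 2

def layer_uses_global(layer_idx: int, every_n: int = 3) -> bool:
    """Illustrative interleaving: one global layer for every two local ones."""
    return layer_idx % every_n == 0

seq_len = 8
for layer in range(4):
    is_global = layer_uses_global(layer)
    mask = (torch.ones(seq_len, seq_len, dtype=torch.bool) if is_global
            else local_attention_mask(seq_len, window=4))
    # Average number of positions each token can see under this layer's pattern.
    visible = mask.float().sum(dim=-1).mean().item()
    print(f"layer {layer}: {'global' if is_global else 'local'}, avg visible tokens = {visible:.2f}")
```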

I have to say that Meta has been releasing many of my favorite papers this quarter, with several giant papers coming out this past week. It seems like they are really trying to investigate the limitations and rigidity of next-token prediction, potentially creating a path toward dynamic compute. 🧵

Comparing my new BERT-like model (pink) vs. a Transformer++ BERT, both pre-trained on ~30M samples and fine-tuned on MNLI. This is super promising, and there is likely still more to gain in future iterations. Looks like a full training run and research report are going to be filling my weekend.
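
For context, the comparison setup looks roughly like the sketch below: fine-tune each pre-trained encoder on MNLI with identical settings and compare matched-validation accuracy. This uses the Hugging Face Trainer; the checkpoint names and hyperparameters are placeholders, not my actual configuration.

```python
import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

def finetune_on_mnli(checkpoint: str) -> float:
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    dataset = load_dataset("glue", "mnli")

    def tokenize(batch):
        return tokenizer(batch["premise"], batch["hypothesis"],
                         truncation=True, max_length=128)

    dataset = dataset.map(tokenize, batched=True)
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=3)

    def accuracy(eval_pred):
        logits, labels = eval_pred
        return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

    args = TrainingArguments(output_dir="mnli-" + checkpoint.split("/")[-1],
                             learning_rate=2e-5,
                             per_device_train_batch_size=32,
                             num_train_epochs=3)
    trainer = Trainer(model=model, args=args,
                      train_dataset=dataset["train"],
                      eval_dataset=dataset["validation_matched"],
                      tokenizer=tokenizer,  # enables dynamic padding via the default collator
                      compute_metrics=accuracy)
    trainer.train()
    return trainer.evaluate()["eval_accuracy"]

# Hypothetical usage comparing the two encoders from the plot:
# print(finetune_on_mnli("my-orion-bert"), finetune_on_mnli("transformer-pp-bert"))
```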

I have to think that foundation model pre-training will continue to divide and specialize. In the past year we have seen the introduction of the annealing phase, and I suspect this allows for a rougher pre-training phase that takes advantage of low precision and/or structured matrices.
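
By "annealing phase" I mean the split schedule popularized by warmup-stable-decay style pre-training: a long, constant-LR stable phase followed by a short decay at the end. A minimal sketch is below; the fractions and learning rates are illustrative assumptions, not values from any specific paper.

```python
import math

def wsd_lr(step: int, total_steps: int, peak_lr: float = 1e-3,
           warmup_frac: float = 0.01, anneal_frac: float = 0.1,
           min_lr: float = 1e-5) -> float:
    """Warmup -> long constant 'stable' phase -> short final annealing phase."""
    warmup_steps = max(1, int(total_steps * warmup_frac))
    anneal_steps = max(1, int(total_steps * anneal_frac))
    stable_end = total_steps - anneal_steps

    if step < warmup_steps:
        # Linear warmup to the peak learning rate.
        return peak_lr * (step + 1) / warmup_steps
    if step < stable_end:
        # Long, constant phase: the "rough" part that could tolerate
        # low precision and/or structured-matrix tricks.
        return peak_lr
    # Final annealing phase: cosine decay down to min_lr.
    progress = (step - stable_end) / anneal_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

# Example: learning rate at a few points of a 100k-step run.
for s in (0, 500, 50_000, 95_000, 99_999):
    print(s, round(wsd_lr(s, 100_000), 6))
```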

More experiments and model updates to come. This model is severely under-trained, having only seen 32M samples (out of 300M possible) so far.

Finally got around to completing the first major training runs of my own BERT-like language embedding model. There is a ton of data to pore over as I prepare my next experiment for this weekend, but early results show my model outperforming a Transformer++ BERT model by 1% with fewer params!

Starting my first official training run on the path to developing my own BERT-like foundation model. Trying to manage my expectations, but I still believe that this model will be a few percentage points better than its counterparts trained with MLM. Orion I is ready for launch.
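
For anyone curious what "trained with MLM" refers to: masked language modeling corrupts a fraction of the input tokens and trains the encoder to reconstruct them. Below is a minimal sketch of the corruption step using the standard 15% / 80-10-10 BERT recipe; it is illustrative only, not my actual training code.

```python
import torch

def mask_tokens(input_ids: torch.Tensor, mask_token_id: int, vocab_size: int,
                mlm_prob: float = 0.15):
    """Return (corrupted_input_ids, labels) for one MLM training step."""
    labels = input_ids.clone()
    corrupted = input_ids.clone()

    # Pick which positions are part of the MLM objective.
    selected = torch.rand(input_ids.shape) < mlm_prob
    labels[~selected] = -100  # loss is only computed on the selected positions

    # 80% of selected positions become [MASK].
    masked = selected & (torch.rand(input_ids.shape) < 0.8)
    corrupted[masked] = mask_token_id

    # 10% become a random token; the remaining 10% are left unchanged.
    randomized = selected & ~masked & (torch.rand(input_ids.shape) < 0.5)
    corrupted[randomized] = torch.randint(vocab_size, (int(randomized.sum()),))

    return corrupted, labels

# Toy usage: a batch of 2 sequences of length 8 from a 1,000-token vocabulary.
ids = torch.randint(5, 1000, (2, 8))
corrupted, labels = mask_tokens(ids, mask_token_id=4, vocab_size=1000)
print(corrupted)
print(labels)
```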