alexanderking.bsky.social
82 posts 73 followers 63 following

The median home in the US used to cost 6.7 years of median income. Now it costs 10.

I think one of the reasons why I’m so interested in the AI race is that I think AI will be a net positive. I think people will be able to use it to better themselves and their communities. Every advancement means that humanity is more capable of solving problems than before.

Forget “tapestry” or “delve”; these are the actual unique giveaway words for each model, relative to each other. arxiv.org/pdf/2502.12150

@nytimes.com is a joke. The main job of an academic researcher is to openly publish their work. Anyone who receives federal research funding from the NIH has to make their final manuscript AND research data publicly available.

For what it's worth, Early Grok 3 via lmarena got 3/10 on the simplebench sample questions, which would place it at around Gemini 2.0 Flash Thinking and DeepSeek R1: below Claude 3.5 Sonnet and o1, and above o3-mini (high) and GPT-4o. So it's decent at these understanding-based tasks, but not the best.

The wild thing is that incomes for the bottom 20% are legitimately higher than they were in 2019, but the decreases since 2021 have probably come with a lot of pain.

Grok 3 fail? Perhaps it's not good at coding?

One reason I’m highly skeptical of Grok 3’s supposed performance is that I would not put it past Elon Musk to overweight the benchmarks in the training data.

Hot take: This article is bad. There's no methodology described for their research. There is no link to an actual research paper that would describe it (assuming there is one). Also, it doesn't address U-6 unemployment or growing wages.

I remember when tech people (especially on Threads) were calling these the second coming. For reference, Apple sells about 25-50 million Apple Watches every year.

AI robotics is going to run into the same issue self-driving cars have run into.

Too ez

The Apple Vision Pro makes a lot more sense as a product when you see it as a progress report for investors rather than an actual product to be used

I asked LLMs whether they’d destroy 1 loaf of bread in South Sudan or 10,000 loaves of bread in the US. GPT-4o and o3-mini would destroy the one. Gemini Thinking, Claude Sonnet, and DeepSeek R1 would destroy the 10,000.

Nvidia has a lower P/E than Costco or Salesforce. I think it just speaks to how much money Nvidia is actually making right now.

Both the sentiment here and the pessimistic replies fail to acknowledge a central quality of humans. Human behavior is largely driven by the incentive structures and systems that they are in. If you look across cultures and history, human behavior can vary wildly.

This is the most relevant article to NIH and research cuts I’ve seen. Imagine if this was today, how many people would be saying “Why are we studying Gila Monsters and their impact on diabetes? That’s wasted money!” globalnews.ca/news/9793403...

If computers were originally supposed to be bicycles for the mind, then every AI advancement is adding a more and more powerful electric motor to that bicycle. AGI is a self-driving bicycle. You can see where there might be a hangup in getting there.

By far the most shocking thing to happen in Dallas in human history.

Almost everything is easier said than done.

Severance is a phenomenal show.

An interesting thing about the simplebench benchmark (with questions designed to trick LLMs) is that most of the variance seems to be explained by the LLM's performance on language understanding benchmarks and scientific understanding benchmarks.

I think that textual LLMs can be genuinely useful assistants and consultants for people in a lot of cases. I think they will be a genuine benefit to humanity. I also think that AI-generated “art” is an affront to the human endeavor and that anyone who enthusiastically creates it cannot be trusted.

One of the interesting things about Simplebench and Livebench is that raw reasoning capabilities don't actually correlate well with the linguistic adversarial robustness needed to do well on Simplebench. Simplebench correlates best with language understanding.

The falling cost of developing AI was always going to happen, but it’s definitely happening faster than people expected.

Cats

Severance is one of the best TV shows of all time.

This is a very striking graph. Especially if you believed that the way to have a popular program is to propose things that polls show there is already support for.

Wikipedia is here!

Cardiovascular disease mortality has declined in many countries over the past decades. For example, US cardiovascular disease death rates have fallen roughly fourfold since 1950.

Devastating for scientists and their research, and everyone who benefits from that research. I’m not sure if everyone outside academia is aware that a delay or “pause” in grant funding often means the researchers themselves are lost from the field, along with their expertise.

Inflation compared between countries, where price levels in January 2020 are set at a value of 100 for each country. The black line is what 2% annual growth would look like.

I'd argue that the reduction in home ownership tracks pretty well with house prices outpacing incomes for that age bracket.