sidsr.bsky.social
🇺🇸🧸 TIME Person of the Year 2006 NASA mission patch enjoyer https://github.com/sidharthrajaram
30 posts 63 followers 211 following

Harlequin, by Pablo Picasso, 1923

Posting my first ever paper on BSKY as we inch towards the 7-month anniversary of its publication in Materials Advances.

Never have my biases been confirmed so thoroughly. Great read by @glennklockwood.com (for anyone who broadly wants to see how LLM training in the data center works)

🤖🛫

"*reframing* the AI policy debate" what? For 6 months, there have existed open source models from China that outgun similar sized models from OpenAI on most benchmarks. Research in China builds on research from the US, and vice-versa. This is not some crisis. The only debate is on export controls.

a team from berkeley has put out a reproduction of DeepSeek R1 Zero in countdown and multiplication tasks (built upon the qwen2.5 series base model) "Through RL, the 3B base LM develops self-verification and search abilities all on its own"

nvidia staying a top 5 market cap company in the long term would be like if the store that sold the shovels during the Gold Rush were still one of the most valuable companies in SF. We’ll know AI’s utility and bounty are actually diffusing throughout society when that turns.

How does one square these (paraphrased) claims by certain elements of industry: 1. “The US must stay ahead of China in the AI race. It’s critical for national security.” 2. “We must be allowed to sell GPUs to China. It’s critical for national security.” (Answer key below)

en.wikipedia.org/wiki/Mikoyan...

For anyone experimenting with Anthropic’s MCP who wants the server to be a separate process from the client, I put together an example client/server for this using SSE: github.com/sidharthraja...

a single image from a film that immediately makes you hear the song from the soundtrack playing in your head

You can run Qwen2-VL-2B on your Mac with llama.cpp (instructions in the README). The Qwen series has been among the leading multimodal model families for a couple of months now. Fun to run it on the Mac. Get the GGUF files and instructions here: huggingface.co/sidrajaram/Q...

bring back over-the-top, artistic ads for computing

Really cool new work out of DeepMind for video game world generation using latent diffusion! Soon you'll be able to speedrun a game just by tricking a model into morphing you from one location to another. deepmind.google/discover/blo...

appreciating humanity's splendor by looking at ISS mission patches on wikipedia. A few favorites:

multi-agent systems for planning and reasoning (1995, colorized)

the Cold War was different. they would over-engineer something insane like an air-to-air nuclear missile or a chicken-powered nuclear land mine and then give it a code name like 'Ding Dong'.

definitely one of the quotes of all time

`answer = s3_client.prompt_object("my-bucket", "prefix/to/image.png", prompt="What's going on in this image?")` thoughts on this kind of abstraction for interacting with data on S3?

Open source AI & ML starter pack! Cool people / accounts I came across on here: go.bsky.app/3RatSNe #ai #opensource #ml #machinelearning

we recently put together `PromptObject`, which makes asking questions and talking with objects (like images/documents) as easy as one line of code. GET, PUT, and now PROMPT (!!) #ai #ml #data #minio #computervision #s3