mrdrozdov.com
Research Scientist @ Mosaic x Databricks. Adaptive Methods for Retrieval, Generation, NLP, AI, LLMs https://mrdrozdov.github.io/
170 posts 5,026 followers 604 following
comment in response to post
Embedding finetuning is not a new idea, but it's still overlooked IMO. The promptagator work is one of the more impactful papers that show finetuning with synthetic data is effective. arxiv.org/abs/2209.11755
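(Not the Promptagator recipe itself, just a minimal sketch of the general idea using sentence-transformers: generate synthetic query-passage pairs, then fine-tune the embedding model on them with in-batch negatives. The model name, pairs, and hyperparameters below are placeholders.)

```python
# Sketch: fine-tune a retrieval embedding model on synthetic (query, passage) pairs.
# Illustrative only; model name, data, and hyperparameters are placeholders.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Synthetic pairs, e.g. queries generated by an LLM from your own passages.
synthetic_pairs = [
    ("how do I reset my router", "Unplug the router for 30 seconds, then plug it back in ..."),
    ("what is dense retrieval", "Dense retrieval encodes queries and documents into vectors ..."),
]

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
train_examples = [InputExample(texts=[q, p]) for q, p in synthetic_pairs]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)

# In-batch negatives: every other passage in the batch acts as a negative.
train_loss = losses.MultipleNegativesRankingLoss(model)
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=10)
```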
comment in response to post
Search is the key to building trustworthy AI and will only be more important as we build more ambitious applications. With that in mind, there's not nearly enough energy spent improving the quality of search systems. Follow the link for the full episode: www.linkedin.com/posts/data-b...
comment in response to post
After frequent road runs during a Finland visit, I tend to feel the same
comment in response to post
I’m being facetious, but the truth behind the joke is that OCR correction opens up the possibility (and futility) of language much like drafting poetry. For every interpreted pattern for optimizing OCR correction, exceptions arise. So, too, with patterns in poetry.
comment in response to post
Wait can you say more
comment in response to post
Yep! Slides are linked on the tutorial site: retrieval-enhanced-ml.github.io/sigir-ap2024...
comment in response to post
I guess it's not an official ad. bsky.app/profile/why....
comment in response to post
You should check out dspy! The correct answer is not to do a grid search at all, but to treat this like an optimization problem.
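(Rough sketch of what I mean, using DSPy's BootstrapFewShot optimizer; the signature, metric, and training examples are placeholders, and the LM setup will vary by DSPy version.)

```python
# Sketch: let a DSPy optimizer search over prompts/demonstrations against a metric,
# instead of hand-rolling a grid search over prompt variants. Placeholders throughout.
import dspy
from dspy.teleprompt import BootstrapFewShot

lm = dspy.LM("openai/gpt-4o-mini")  # assumed provider/model; adjust for your setup
dspy.settings.configure(lm=lm)

class AnswerQuestion(dspy.Signature):
    """Answer a question concisely."""
    question = dspy.InputField()
    answer = dspy.OutputField()

program = dspy.ChainOfThought(AnswerQuestion)

def exact_match(example, prediction, trace=None):
    # Toy metric: does the prediction contain the gold answer?
    return example.answer.lower() in prediction.answer.lower()

trainset = [
    dspy.Example(question="What is the capital of Finland?", answer="Helsinki").with_inputs("question"),
    # ... more labeled examples
]

# The optimizer does the searching; no manual grid over prompt variants.
optimizer = BootstrapFewShot(metric=exact_match)
compiled_program = optimizer.compile(program, trainset=trainset)
```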
comment in response to post
If they have remote courses under $1k then you're in the clear.
comment in response to post
Same same! I wonder if the slides team has been feeling heat from a competing product with a relatively streamlined workflow.
comment in response to post
3. You're incentivized to stay in character. If you get a lot of likes, then you get preference when choosing which role you'll have next. This bypasses the need for verification since the accounts are verified by definition, so there are no "fake" profiles. Although it does not guarantee verified posts.
comment in response to post
Similar comment from a reliable source. x.com/earnmyturns/...
comment in response to post
FWIW, even if the above holds, requiring external memory just to match the quality of LSTMs still feels weaker than decomposable attention or transformers. Also, it would make me think there is some version of a vanilla RNN that is self-attention-like, which predates memory nets by a while.
comment in response to post
@thomlake.bsky.social I'm willing to be convinced. This would give me a whole new appreciation for the memory net work. Can you show that memory nets could process the whole sequence in parallel w/o a for-loop? IMO this is the key capability that self-attention enables.
comment in response to post
Okay, I'm officially confused.
> In this work, we present a novel recurrent neural network (RNN) architecture where the recurrence reads from a possibly large external memory multiple times before outputting a symbol
But looking at the eqns, the internal attn could potentially be done w/ masked attn?
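(Illustrative sketch of the point I'm getting at, with made-up shapes and no relation to the paper's actual equations: attention reads over a fixed external memory can be computed for every timestep with a single matmul, i.e. no for-loop, as long as each query doesn't depend on the previous step's read. If it does, that's exactly the recurrence that blocks parallelization.)

```python
# Sketch: attention reads for all timesteps computed in parallel vs. one read per
# step in a for-loop. Illustrative only; not the memory network architecture itself.
import torch
import torch.nn.functional as F

T, n_mem, d = 6, 10, 32            # timesteps, memory slots, hidden size
queries = torch.randn(T, d)         # one query per output position
memory = torch.randn(n_mem, d)      # fixed external memory (keys == values here)

# Parallel version: one matmul scores every (timestep, memory slot) pair at once.
scores = queries @ memory.T / d ** 0.5                 # (T, n_mem)
reads_parallel = F.softmax(scores, dim=-1) @ memory    # (T, d)

# Sequential version: the same reads, one timestep at a time.
reads_loop = []
for t in range(T):
    s_t = queries[t] @ memory.T / d ** 0.5
    reads_loop.append(F.softmax(s_t, dim=-1) @ memory)
reads_loop = torch.stack(reads_loop)

assert torch.allclose(reads_parallel, reads_loop, atol=1e-5)
# If queries[t] were a function of reads_loop[t-1] (a true recurrence), the loop
# could not be collapsed into the single matmul above.
```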
comment in response to post
Memory networks are amazing, although they are not attention-only. The first sentence of the abstract makes it clear they're recurrent models:
> We introduce a neural network with a recurrent attention model over a possibly large external memory.