aarontay.bsky.social
I'm a librarian + blogger from Singapore Management University. Social media, bibliometrics, analytics, academic discovery tech.
590 posts 2,845 followers 306 following
Regular Contributor
Active Commenter
comment in response to post
That's what I think anyway. More empirical research needed.
comment in response to post
The same, I think, applies if the "AI academic search" uses, at least in part, semantic search based on embedding models. Technically it's the difference between transformer encoder models and GPT-type decoder models.
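A minimal sketch of what embedding-based semantic search does under the hood, assuming the sentence-transformers library; the model name and documents are just illustrative. The encoder maps the query and documents into one vector space, and retrieval is nearest-neighbour scoring, not text generation:

```python
# Semantic search sketch: an encoder model embeds queries and documents
# into the same vector space; retrieval is similarity ranking.
import numpy as np
from sentence_transformers import SentenceTransformer  # encoder, not a GPT-style decoder

model = SentenceTransformer("all-MiniLM-L6-v2")  # small example encoder

docs = [
    "A survey of retrieval augmented generation for question answering",
    "Citation analysis methods in bibliometrics",
    "Keyword extraction techniques for academic search",
]
doc_vecs = model.encode(docs, normalize_embeddings=True)

query_vec = model.encode(["how do RAG systems answer questions"],
                         normalize_embeddings=True)[0]

# With normalized vectors, the dot product is cosine similarity.
scores = doc_vecs @ query_vec
for rank in np.argsort(-scores):
    print(f"{scores[rank]:.3f}  {docs[rank]}")
```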
comment in response to post
To be fair, the extra prompt engineering trick may have made the LLM do better at the Boolean, but we don't know. I'm pretty sure the LLM research people rely on doesn't test the narrow task of coming up with Boolean queries.
comment in response to post
The amazing thing is, if you try such long-winded prompt engineering tricks, the LLM is typically smart enough to just ignore the irrelevant parts and come up with a not-too-crazy Boolean search (though typically too broad, but that's another story). It's insidious because people think their extra prompting helped.
comment in response to post
E.g. some research suggests emotional or motivational appeals (e.g. bribes) seem to get better results from LLMs like ChatGPT. But does it really make sense to input such statements when you know Primo RA is just going to try to convert what you type into a Boolean keyword search?
comment in response to post
Take Scopus AI or Primo Research Assistant, which take your input and ask an LLM to convert it into a Boolean keyword search. I argue that in this case doing a lot of the typical prompt engineering tricks is suboptimal.
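A hedged sketch of that query-rewriting step (the prompt, model name, and example input are my own illustration, not Primo RA's or Scopus AI's actual implementation; assumes the openai Python package). The point: whatever you type gets reduced to a Boolean string, so extras like bribes are simply discarded downstream:

```python
# Illustrative query-rewriting step: the LLM's only job is to turn the
# user's text into a Boolean keyword string, so chatty prompt-engineering
# flourishes add nothing to the search that follows.
from openai import OpenAI  # model name below is illustrative

client = OpenAI()

SYSTEM = ("Convert the user's request into a Boolean keyword search "
          "(AND/OR/NOT, quoted phrases). Return only the query string.")

user_input = ("Please, I'll tip you $20: find papers on the effect of "
              "social media use on adolescent sleep quality")

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "system", "content": SYSTEM},
              {"role": "user", "content": user_input}],
)
# The bribe is discarded; only the extracted keywords survive, e.g.
# ("social media") AND (adolescent* OR teenager*) AND ("sleep quality")
print(resp.choices[0].message.content)
```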
comment in response to post
I suspect the fact that MS Copilot/ChatGPT is an LLM fine-tuned to be a chatbot with search as a tool has different implications for the way you should engage with it vs a straight-out academic search engine, even one with "AI".
comment in response to post
The best way to show this is to input "how are you" or a task like "summarise xyz". Microsoft Copilot or ChatGPT will typically answer without a search, while the "AI academic search" (e.g. Primo Research Assistant) will always search to try to find relevant documents, run RAG, and answer, with amusing results.
comment in response to post
Microsoft Copilot and ChatGPT, WHEN they search, are doing some type of retrieval augmented generation. The main difference between MS Copilot/ChatGPT search and an academic search engine is that the former can "choose" to search or respond like a usual LLM chatbot, while the academic search engine ALWAYS searches.
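A contrast sketch with stubbed, hypothetical helpers (none of these names come from any real product): a chatbot with search as a tool can skip retrieval, while an academic RAG engine retrieves on every input:

```python
# Stubbed contrast: optional search (chatbot) vs forced search (academic RAG).

def llm_generate(user_input, context=None):
    # Stub standing in for an LLM call.
    grounding = f" [grounded in {len(context)} retrieved docs]" if context else ""
    return f"answer to {user_input!r}{grounding}"

def needs_search(user_input):
    # Stub for the tool-use decision the chatbot's LLM makes itself.
    return "paper" in user_input or "latest" in user_input

def search(user_input):
    return ["doc1", "doc2", "doc3"]  # stub retrieval

def chatbot(user_input):
    # MS Copilot / ChatGPT style: search is an optional tool.
    if needs_search(user_input):
        return llm_generate(user_input, context=search(user_input))
    return llm_generate(user_input)

def academic_rag(user_input):
    # Primo RA / Scopus AI style: retrieval happens on every input, no branch.
    return llm_generate(user_input, context=search(user_input))

print(chatbot("how are you"))       # answers directly, no search
print(academic_rag("how are you"))  # still searches, hence the amusing results
```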
comment in response to post
To be fair, most vendors of "AI academic search engines" don't talk about it. Either they don't know, or it's too obvious to them, or they try to pretend ChatGPT can't search and score easy points on "why not ChatGPT?" questions. Another easy answer is "we cover only academic content" (2)
comment in response to post
Reminds me of a certain product meant for scholarly communications that often markets itself by saying its advantage is that it's exactly like your (library) procurement system.
comment in response to post
The video shows that you can do -slow+fast instead of -man+woman and get basically the same result, as all you need is a pair of vectors that are close enough to cancel out. NOTE: in many embedding models antonyms are close together because they are conceptually similar (3)
comment in response to post
It argues that for most embedding models the vectors for King and Queen are the two embeddings closest together (not counting themselves) anyway, and given man and woman are similar, doing -man+woman mostly cancels out, so you are pushed back to the vector closest to King (that isn't King), aka Queen (2)
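A quick way to try this yourself, assuming gensim is installed; the GloVe model is one of gensim-data's standard downloads. Note that most_similar excludes the input words themselves from the results, which is exactly the "closest vector to King that isn't King" effect described above:

```python
# Word-vector arithmetic with pretrained GloVe embeddings.
import gensim.downloader as api

wv = api.load("glove-wiki-gigaword-50")  # small pretrained word vectors

# Classic analogy: king - man + woman. If man/woman largely cancel out,
# you land on the word nearest to "king" that isn't "king" -- i.e. "queen".
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))

# The thread's point: near-antonyms sit close together in embedding space,
# so a conceptually unrelated pair like slow/fast may also roughly cancel
# and land in much the same neighbourhood.
print(wv.most_similar(positive=["king", "fast"], negative=["slow"], topn=3))
```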
comment in response to post
I agree. That's why I got my institution pro access to Undermind. The only one that can match Undermind is Elicit.com with their research reports/systematic review feature. But it's very pricey.
comment in response to post
Hence my view that people, typically undergraduates, who don't know what a good literature review looks like shouldn't even use these tools... particularly untutored...
comment in response to post
I don't think it is an unsolvable problem. Up to now, people were mostly trying Q&A-type tasks, but as they move towards longer report generation, the problems can be reduced by a combination of carefully handcrafted workflows + agent reasoning.
comment in response to post
And I don't think Scopus AI's natural language search can parse half as many fields as WOS RA can. In fact, I think most of the usual suspects will parse pub year, maybe... and that's it.
comment in response to post
WOS RA competitor Scopus AI has similar ideas of having specific workflows, but it is a more restrained offering: "concept maps", "topic experts", "emerging themes". The topic experts, like WOS top authors, can be hit or miss, but generally I find it less confusing since they are always offered.
comment in response to post
While it sounds nice to be able to ask for papers on topic X by authors from affiliation Y with citations > Z, it also means you can accidentally trigger a workflow you didn't expect. I saw that in a demo where one query got a normal RAG answer while another, slightly different one got the top author workflow.
comment in response to post
But I agree, the problem with these RAG models & even deep research is that they are just seeing the trees (more like leaves), not the forest. You get a sense they are just looking at individual chunks and trying to mix them. That's why they often cite papers in odd ways, not wrong per se but weird.
comment in response to post
I have no idea how they even identify it; it's kinda crazy sometimes. Probably some bibliometric magic.
comment in response to post
Add the fact that WOS RA seems to be able to recognize all sorts of fields in natural language: not just pub year, but affiliation, country, journal issue/vol, DOI, citation counts etc. Look at the types of questions WOS RA can answer: webofscience.zendesk.com/hc/en-us/art...
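A hedged sketch of what such natural-language field parsing could look like (the schema, prompt, and model name are my own illustration, not how WOS RA actually does it; assumes the openai package): an LLM extracts structured filters before any search runs:

```python
# Illustrative field extraction: natural language in, structured filters out.
import json
from openai import OpenAI  # model name below is illustrative

client = OpenAI()

SYSTEM = ("Extract search filters from the user's request as JSON with keys: "
          "topic, pub_year, affiliation, country, journal, doi, min_citations. "
          "Use null for anything not mentioned.")

query = ("papers on perovskite solar cells from ETH Zurich after 2020 "
         "with more than 100 citations")

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},
    messages=[{"role": "system", "content": SYSTEM},
              {"role": "user", "content": query}],
)
filters = json.loads(resp.choices[0].message.content)
print(filters)  # e.g. {"topic": "perovskite solar cells", "pub_year": ">2020", ...}
```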
comment in response to post
I also find Web of Science RA confusing as heck; besides the RAG search, it seems there are a ton of other tools/workflows available. There's "seminal papers", topic maps, co-citation maps (not the same thing), topic over markness models, top authors, all with different visualizations. It's really a mess.
comment in response to post
The main issue is that even if I ignore these guided tasks and use the main search, it will still randomly decide to forgo the usual RAG answer (which tends to be okay) and instead try to find "seminal papers"... and there's no way to even get a clue as to why the results were surfaced.
comment in response to post
For example, here I even specify the context of "binary independence model" to be information retrieval, when really you shouldn't need to, yet it still gives me "code switching" as a seminal paper, which is sociolinguistics??
comment in response to post
Feels to me the seminal papers it finds are just totally weird, or at best sort of relevant but way too far away. E.g. asking about large language models and getting LDA and PageRank as suggested seminal papers!
comment in response to post
Well, at least I THINK I understand. LLMs seem to concur as well, but you know them, often they act like a bunch of brown-nosers.
comment in response to post
No doubt. But they aren't as naive with GAI as some might think, is my takeaway. I remember seeing similar findings in other papers too.
comment in response to post
Something like this is now a standard slide in many talks. Doesn't seem to make a difference...
comment in response to post
This was inspired by a recent chat with a librarian and also the desperate need to come up with some content for an upcoming talk lol
comment in response to post
What do you think? -end-
comment in response to post
Lumping all these diverse tools under "AI" or "research assistant" means discourse about them gets super confused, and we lose the ability to say this feature is relatively OK, while this one is extremely tricky & should be used with deep caution.
comment in response to post
BTW I've seen many talks where speakers dutifully show some Venn diagram of the overlap of AI, deep learning, machine learning etc. But it's often just brushed aside with no significance for the rest of the talk (6)
comment in response to post
Our friends from the evidence synthesis world would object to the catchall "AI" because they want to distinguish traditional supervised learning, active learning, and even some unsupervised learning/clustering techniques already in use from research testing the use of LLMs, which makes sense of course (5)
comment in response to post
The only other type of tech that could be problematic is tools where you type some text and they try to find a citation to fit. Some literature on recommenders calls these "local recommenders" as opposed to global ones. Note the parallels to RAG in allowing generation without assessment (4)
comment in response to post
Compare to a retrieval augmented search like Primo Research Assistant/Scopus AI. It promises to give you a direct answer by summarising from the top sources found. This is something new that goes beyond a normal recommender system, with a lot of unanswered questions both about the tech & user behavior (3)
comment in response to post
When you think about it, Connected Papers etc. are basically recommending papers based on citation search. While some implementations are less transparent, fundamentally you still have to read the paper and assess it. Same goes for "semantic search", which can be seen as just souped-up keyword search (2)
comment in response to post
Possibly. But if @pwgtennant.bsky.social's question is why people are still on Twitter at this moment in time, then for me this is a reason. Maybe at some point in the future my Ghana, Zimbabwe, Ethiopia, Pacific Island etc. colleagues will be here, but they aren't currently, & so a foot in the other place is needed.
comment in response to post
Huh