From: Comeau, Don (NIH/NLM/NCBI) [E]
Sent: Tuesday, March 04, 2008 10:13 AM
To: NLM/NCBI List ncbi-seminar
Subject: CBB seminar, Tuesday 3/4/2008, 11 am, B2 Library

Don Comeau

What Are You Looking For? PubMed Queries and Evaluating Retrieval with Machine Learning

Examining sample PubMed queries shows that they usually contain known phrases rather than mere series of words. To determine whether this observation could be used to improve retrieval quality, we turned to machine learning rather than human experts. Bayesian machine learning can clearly distinguish documents that contain a particular phrase from documents that simply contain the words of the phrase, as retrieved by a Boolean AND of the query terms. It also showed that documents containing the query terms in a single sentence are more like those containing the exact phrase than are documents that merely contain the query terms somewhere in the text.
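
The three kinds of match compared in the abstract (exact phrase, all query terms in one sentence, and a plain Boolean AND over the whole document) can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the method used in the talk: the tokenizer, the naive sentence splitter, and the function names `tokens`, `sentences`, and `match_level` are all hypothetical, and the Bayesian learning step itself is not shown.

```python
import re


def tokens(text):
    """Lowercase word tokens (assumed tokenizer: runs of letters and digits)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def sentences(text):
    """Naive sentence splitter (assumption): break after '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def has_phrase(sentence_tokens, query_tokens):
    """True if query_tokens occur as a contiguous run in sentence_tokens."""
    n = len(query_tokens)
    return any(
        sentence_tokens[i:i + n] == query_tokens
        for i in range(len(sentence_tokens) - n + 1)
    )


def match_level(document, query):
    """Return the strictest level at which the document matches the query.

    'phrase'   : the query words appear adjacent and in order in a sentence
    'sentence' : all query words appear in one sentence, in any order
    'and'      : all query words appear somewhere in the document
    'none'     : at least one query word is missing
    """
    q = tokens(query)
    if not q:
        return "none"
    q_set = set(q)
    per_sentence = [tokens(s) for s in sentences(document)]

    if any(has_phrase(st, q) for st in per_sentence):
        return "phrase"
    if any(q_set <= set(st) for st in per_sentence):
        return "sentence"
    if q_set <= set(tokens(document)):
        return "and"
    return "none"


if __name__ == "__main__":
    examples = [
        "Breast cancer risk rises with age.",
        "Cancer of the breast was studied.",
        "Breast tissue was sampled. Later the cancer was found.",
        "Lung tissue was sampled.",
    ]
    for doc in examples:
        print(match_level(doc, "breast cancer"), "|", doc)
```

In this sketch a Boolean AND retrieves every document labeled 'phrase', 'sentence', or 'and'; the abstract's observation is that the 'sentence' group resembles the 'phrase' group more closely than the 'and' group does.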