Unnecessary thievery

Several years ago, I was working on modeling the hypothalamic-pituitary axis with my colleague Joe Gonzalez-Heydrich. Unsurprisingly, we could not find any primary data in the articles that ostensibly described the relationships among the various hormones of this axis. So I found a very nice shareware program called DataThief. DataThief is "a program to extract (reverse engineer) data points from a graph. Typically, you scan a graph from a publication, load it into DataThief, and save the resulting coordinates, so you can use them in calculations or graphs that include your own data." It worked as billed. Recently, when I was working with my colleague Asher Schacter on predicting outcomes of drug development from pre-clinical data, I remembered how useful DataThief had been and recommended that he use it to extract the primary data from publications for each of the pharmaceuticals he wanted to study. Lo and behold, it worked again!
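
For the curious, the core idea behind this kind of extraction can be sketched in a few lines of Python. This is not DataThief's actual code or method, just a minimal illustration that assumes the scanned graph is not rotated and that you have clicked two reference ticks of known value on each axis. All names here (AxisCalibration, pixels_to_data) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class AxisCalibration:
    """Two reference ticks on one axis: a pixel position and its known value."""
    pixel_a: float
    value_a: float
    pixel_b: float
    value_b: float
    log_scale: bool = False  # assumption: axes are either linear or log10

    def to_value(self, pixel: float) -> float:
        """Interpolate (or extrapolate) a pixel position to a data value."""
        va, vb = self.value_a, self.value_b
        if self.log_scale:
            # Interpolate in log space, then transform back.
            va, vb = math.log10(va), math.log10(vb)
        fraction = (pixel - self.pixel_a) / (self.pixel_b - self.pixel_a)
        value = va + fraction * (vb - va)
        return 10 ** value if self.log_scale else value


def pixels_to_data(points, x_axis, y_axis):
    """Convert clicked (pixel_x, pixel_y) pairs into data coordinates."""
    return [(x_axis.to_value(px), y_axis.to_value(py)) for px, py in points]


# Made-up example: image y pixels grow downward, so the larger
# data value sits at the smaller pixel row.
x_axis = AxisCalibration(pixel_a=100, value_a=0.0, pixel_b=500, value_b=10.0)
y_axis = AxisCalibration(pixel_a=400, value_a=0.0, pixel_b=50, value_b=70.0)
recovered = pixels_to_data([(300, 225)], x_axis, y_axis)
```

The recovered values are only as good as the scan and the printed figure, since every pixel of error in a click becomes an error in the data.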

If only we had a policy in place requiring that all primary data be deposited in a public electronic repository or repositories, this additional, laborious, and time-consuming step would be unnecessary. Bioinformaticians have been very effective in demonstrating the value of sharing primary experimental data (e.g., high-throughput data such as gene expression or gene variant data), but clinical researchers have yet to achieve the same enlightenment. Until they do, please make sure the graphs in your publications are very accurate so that others may benefit from your hard work and the taxpayers' investments in your research.


BLAST this!

David Osterbur often gives extremely well-received lectures on the use of public bioinformatics resources for biologists. However, even he is limited in how many audiences he can reach. So, if you know of a biologist who needs some help using BLAST or the UCSC Genome Browser, or even searching for information about herbs and dietary supplements, you will be happy to know that the Countway Library (in collaboration with the MIT Engineering and Sciences Libraries) has made several instructional videos available. Let me know if these are helpful and if you'd like to see more (and about what).


Out of date but not dated?

This is a great example of how the instant, at-hand, reflexive cut-and-paste nature of electronic information can cross over from the virtual to inflict real harm. Contemplate how out-of-date clinical information could similarly be used to boost the medical malpractice of the incidentalome. Will medical libraries step up to the challenge of keeping the medical profession up to date?

[Thanks to Ben Reis for the pointer]


The Harvard Catalyst site is live and open

The name of the Harvard University Clinical and Translational Science Center is Catalyst. Several of its resources are publicly available. For example, you can now see the biomedical scholarly output of our university at a glance. You can find people, buildings, phone numbers, directions and parking across the entire University (!) with 18+ participating institutions. You can see the influenza risk across our local geography and recent history. You can explore which clinical trials are supported by the institutions across Catalyst. You can use Webdash to share web pages and publications and their citations with collaborators and colleagues. You can browse and search the available Core facilities (in the hundreds). And if you need analytic help, you can reach out to the Catalyst biostatistics program and genetics program, for example. Within a year, we will reveal the data sharing function called SHRINE, which allows authorized users to study patient populations (with regulatory oversight) for pharmacovigilance and for various clinical research projects (e.g., genome-wide studies of asthma, or of major depression resistant to standard antidepressants).

This site is the collaborative effort of multiple informatics groups in our community, including HMS Center for Biomedical Informatics, HMS IT, and the IT groups of Partners Healthcare Systems, Beth Israel Deaconess Medical Center, and Children's Hospital. It was an impressive 107-day dash to bring diverse applications together into one package. It's still rough and in progress, and I would welcome your comments, as would our Research Navigators.

Just some cocktail party conversation for you: Note the decline of protein research, relative to other topics, in the past decade at our University. The same indicators (gratifyingly) show the rise of mathematical topics in our life sciences scholarly output. Our most prolific author is Walt Willett (note the alternate ways his name appears, each with its own publication history; this is to be fixed in the next iteration of Medvane). Note also that JBC appears to be a popular journal for our authors to publish in.