San Francisco, 19-20 April 2013

"The Truth of Personalized Medicine: Our Commons Future"

YIA Statements

The 2013 Young Investigator Award competition generated a large number of excellent submissions to the 4th Commons Congress from a spectrum of young researchers. Excerpts of the interest and activities statements are posted below as an introduction to the impressive young investigators participating in this year’s Congress:



Amrita Basu  As a postdoc at the Broad Institute, I have had a unique opportunity to engage in collaborative science with projects aimed towards a common goal to accelerate science and therapeutic invention. My research involved trying to discover the genetic predictors of drug sensitivity in cancer cell lines of diverse lineages through computational and experimental methods. As part of the Chemical Biology team, I engaged in heavy collaborations with projects such as the Cancer Cell Line Encyclopedia and Achilles RNAi groups. Without extensive knowledge regarding the behavior of hundreds of cancer cell lines that were profiled, our team would have had a limited understanding of why a single cell line is more sensitive or resistant to a particular drug. Many possibilities exist: we may have attributed drug resistance solely to lineage specifications, or cellular growth rates. In a majority of these cases, we would have been wrong. Because our collaborators facilitated transparency and accessibility of data, we were able to achieve an extended understanding of factors beyond lineage, such as expression, copy number, and mutation status, and analyzing these data led us to further probe biological connectivity within cancer signaling pathways. In this experience, deeper transparency in science played an important role in obtaining higher-quality results, so how can we achieve the sharing of knowledge more efficiently? At the Broad, I was part of a team that implemented an automated pipeline for integration, modeling, and complex visualization of high-throughput and biomarker data. This process was accelerated by scientists who merged data continuously, mapped cell line names across platforms, and tracked metadata, all so that the data could be more usable and interpretable to diverse sets of audiences.
These tasks required heavy lifting and can cause challenges for scientific environments that have limited staff and lower budgets but still have to compete within the scientific community at large. To enhance capabilities and to give more power to the scientist working directly with the data, more software tools to handle large amounts of data and integration tasks should be available. In other words, data sharing should not be limited by the burdensome task ahead, but should be enhanced by capabilities that allow us to integrate ‘omics’ datasets efficiently and produce algorithms that can quickly enable scientists to start from a common framework and speak the same language. In addition, scientists should be rewarded more generously for collaborative science within the peer-review process, and recognition should be awarded to those team members who actively participate in making these transparency efforts seamless.


Geoffrey Siwo  The success of personalized medicine and open science will require the engagement of human subjects at a level that has never occurred before in history: human subjects not just as study participants providing data but also actively engaged in asking questions, giving feedback and even directly contributing to discovery. The patient has enormous power to drive discovery, especially through real-time access to data that could impact how we discover medicines. Open science that directly engages ordinary human subjects requires innovative platforms that were unimaginable a few decades ago. That is why I am developing computing platforms that empower ordinary citizens to become part of drug discovery. As a start, I have developed a game called Fit2Cure that uses gaming to leverage the spatial reasoning skills of humans to find drugs. While Fit2Cure is in its early stages, my hope is that one day patients will be able to contribute knowledge to the discovery of drugs for their own disease. That is, patients would help personalize drugs for their disease. As a step towards this vision, I have also developed a tool, Fit2CureMyDisease, that connects to the 23andMe-based personal genetic data of gamers to inform them of any relationships between proteins in the game and mutations in their genome. This way, a gamer can choose to play and contribute knowledge on proteins associated with their disease. While engaging ordinary human subjects in discovery has enormous potential, collaborative discovery platforms such as DREAM and Sage Bionetworks are extremely important in making scientific discoveries. I think that these platforms have a huge potential in advancing drug and vaccine discovery in infectious diseases such as malaria. They could attract a large amount of computational expertise to these diseases, which is currently lacking.
I am looking forward to participating in the Sage Congress to partner with interested organizations in laying the foundation for a computational platform that leverages Synapse and DREAM to contribute new computational models for malaria drug discovery. As I finish my PhD this year, my plan is to crowdsource, through DREAM, Sage and online educational platforms for machine learning, a wide range of experts in various fields to build predictive systems biology models for drug mechanism of action, resistance and safety, primarily for malaria. To achieve personalized medicine, data sharing directly from the public will be necessary, irrespective of their disease state. The public needs to be rewarded and deeply engaged to share data openly. They need to know the real-time progress of research based on their data. I look forward to sharing ideas on the role of patient-driven discovery, crowdsourced challenges for disease models and computational environments that link the two towards personalized medicine.


Nolan Nichols  Big Data and Open Science are having society-wide impacts. However, open and collaborative models are not ubiquitous within the scientific community, even within the biomedical sciences where rapid progress has the potential to greatly improve human health and end suffering. While engineers and molecular biologists have learned and enculturated values that stimulate progress, such as reusable computer code and in silico discovery, human cognitive neuroscience and neuroimaging are still struggling to realize the potential of these approaches. My career goal is to innovate informatics methods that overcome barriers to data re-use and sharing, and to foster reproducibility and discovery in the field of cognitive neuroimaging. By engaging the open neuroscience community, I’ve aligned my dissertation research with grassroots efforts to make public human brain imaging data dramatically more accessible and usable for scientists. My focus is on a recent imaging modality called resting-state functional MRI (rs-fMRI) that shows promise for developing imaging biomarkers for neuropsychiatric disease. However, rs-fMRI analysis methods are still in their infancy and are experiencing rapid growth, producing data that the information systems in place today were not designed to manage. My research takes a collaborative and community-driven approach to investigate informatics problems related to rs-fMRI data sharing by applying modern Semantic Web and Cloud Computing technologies to develop a framework that meets the evolving information needs of rs-fMRI researchers.


Joel Wagner  My research is dependent on having access to large data sets of genomic, proteomic, and pharmacological information. For example, my current work is being driven by data provided in two Nature papers, one from the Broad Institute and Novartis (Barretina et al. 2012) and one from the Sanger Institute and Massachusetts General Hospital (Garnett et al. 2012). From these projects and the many samples they measured I have been able to develop hypotheses I never could have explored otherwise, because performing experiments on that scale requires so many resources. Moving forward, I hope to contribute to publicly shared data by publishing all results related to in vivo pharmacological profiling of patient-derived xenograft tumors in mice. In the future I hope collaborations between multiple institutions, including between industry, academia, and nonprofit organizations, maximize output while minimizing redundancy. I would like to see more pre-competitive collaboration, while still respecting the needs and desires of individual stakeholders. I think we need to consider the entire drug development ecosystem, even when performing early stage experimental and computational analyses, so that we do not proceed with discoveries that are untenable in the clinical, regulatory, or insurance environments. Open science will encourage and enable this necessary foresight.


Alexander Morgan I am currently a student at Stanford University School of Medicine and do research in genomics and on the translation of omics technologies into improving medical care. One of my major research interests is in the clinically relevant interpretation of human genomes. As part of my Biomedical Informatics PhD dissertation work in the lab of Atul Butte, I developed a set of techniques for interpreting the likely clinical impact of the millions of DNA bases in a nearly complete genome. This involved using all studies we could access to link common genetic variants with phenotype, creating a simple set of summary statistics and an overall health report card, the Risk-o-Gram. The only way to approach even a surface understanding of the impact of individual variations on something as complex as a human genome is to integrate a massive amount of information from thousands of previous studies. Doing this type of multiplex meta-analysis relies on having open access to results to reduce biases intrinsic in limited reporting of study results. Current standards for sharing the results of experimental studies are often greatly lacking. For example, many studies we examined reported only that a particular location in the genome was strongly linked to disease risk but did not report which nucleotide base increased risk and which decreased risk. Going forward, we know that human physiology is an exceedingly complex, interconnected system. As we strive to create increasingly effective and personalized medical care, one of the major challenges will be unintended side effects caused by this complexity. To address this issue, we will need access to primary research results from multiple studies across a wide range of experimental modalities.


Benjamin Good Conducting science openly has not just accelerated my research, it has made my entire career possible. In my current position, my primary responsibility is to increase the quantity and effectiveness of community contributions to open, collaborative initiatives in genomic science. For example, I work on the Gene Wiki project within Wikipedia and on scientific discovery games such as ‘The Cure’. If science were not conducted openly, if knowledge and data were not shared, I (like most people in bioinformatics) simply would not have a job. Beyond this requirement for access to open data in order to pay my rent, I have found that openly sharing the products of my own research via blog posts, preprint archives and open access journals has helped in unpredictable ways. For early stage ideas, feedback from comments on blog posts has literally shaped the direction of research projects. Publishing in open access formats has unquestionably resulted in increased visibility for my work. In fact, the chain of events that resulted in finding my current position was the direct result of a project which I published only in an open preprint archive.


Kubra Komek As a computational neuroscience PhD student with a strong interest in schizophrenia, my research focuses on computational models of schizophrenia, mainly oscillatory disturbances and the underlying neural mechanisms. Computational models in neuroscience are extremely useful tools for many reasons including their practicality, predictive power, and generalizability. However, without the necessary empirical data, it is extremely difficult to come up with realistic models. Although the ideal is to gather your empirical data and to form your computational model yourself, that is not very efficient. Therefore, open science with easily available data is crucial for more efficient and effective progress in computational neuroscience. Furthermore, the ultimate goal of most clinical research is to gain a better understanding of the disorder and to look for ways of treatment, which has strong ties with the pharmaceutical industry. My prior internship experience in the pharmaceutical industry made me appreciate clinical research and the difficulties it repeatedly faces. In that sense, I believe that modeling work would make drug discovery a lot more efficient and the testing of such drugs much faster. Taken together, building computational models based on empirical data and applying them to drug discovery is my ultimate goal and the path to a personalized approach to medicine. Given that no two individuals with schizophrenia display the exact same symptoms, the development of personalized medicine becomes a must. For this purpose, I believe that attending the Sage Bionetworks Commons Congress would give me the opportunity to build collaborations with clinicians and pharmaceutical companies, to learn from their experiences, and to talk about the potential barriers regarding access to patient data as well as discuss my personal experience with the modeling work I am doing.
Finally, forming a global database with actual patient data that could better guide the modelers in their search for biophysical parameters needed for their dynamical systems modeling could be extremely effective and I would be very interested in helping out with such a project.


Fuhai Li My research interest is to develop computational and statistical approaches for target discovery and drug repositioning. I have worked on bioimage informatics and software development in high content analysis (HCA) for drug and target discovery. However, sometimes it is difficult to interpret the mechanisms of action of the identified drugs and targets in complex diseases. Thus I am now interested in developing systems biology approaches that integrate large scale pharmacogenomics and interactome data to uncover underlying mechanisms of diseases and reposition existing drugs for new indications. In my opinion, the two greatest challenges of conducting science are 1) solving significant problems, and 2) getting the right tools as soon as possible. The time of one scientific researcher is limited. Conducting science openly and transparently will enable me to receive critiques and suggestions from other experts and to learn whether I am working on a significant problem that is important for the research field. I do not want to waste time working on projects that have been solved by others or are not important for the advancement of science. Moreover, many tools and techniques that are important for my research might already have been developed; support and collaboration from others will save me the time of re-developing these methods and let me focus on the truly challenging part of my research. Similarly, a writer needs to focus on the essay rather than on developing Microsoft Word first. In short, with limited time, we should focus on the truly challenging science with cutting-edge tools, and conducting science openly and transparently is the premise.


Inhan Lee As a bioinformatician, I have found much experimental data sitting on desks, neither published nor even fully analyzed. I also found repeated mistakes due to not knowing others had done the same work. What a waste. After multiple collaborations with several experimental groups, I came to understand the value of data generated by experimentalists, some of whom may have once considered me a scavenging dilettante. There remains a huge gap between theoretical and experimental scientists which only open science can bridge. I founded miRcore, a nonprofit organization, in July 2009, with a vision to democratize medical research. I want changes now because they’re feasible. miRcore’s driving principle is that the general public (patients and family) initiates research that matters to them. Through their donations, patient genomic data (transcriptomes) are collected in a standardized manner in collaboration with doctors or research groups. The high quality data obtained is analyzed and provided to experimental collaborators as initial IP sharers. The data will then be available to other theoretical scientists for additional findings (even before publication). Towards our goal, educating the public is crucial; I have started to educate high school students who can advocate now and become part of the educated public in the future (if not professionals). Interestingly, I found that high school students actually enjoy learning genomics, which led me to expand programs for them. This school year all public high schools in Ann Arbor have chapters of GIDAS (Genes In Diseases And Symptoms), in which students learn about personalized medicine in genomics and share their knowledge with other students. I hope that when they become scientists/doctors, they will readily share data and information. It will be my life’s work to build trusted research networks for collaborations among patients, medical doctors, and scientists.


James Costello My research, primarily systems and network biology, is focused on understanding how genomically encoded elements, epigenomic modifications, and genome architecture coordinate to produce phenotypes in human disease and response to small molecules. It would be impossible for me to effectively build computational models on data that was limited to what can be produced in one lab. I would take this one step further and say that the entire field of systems biology is founded inherently on the principle that large-scale genomics data needs to be shared to make advances. We see new and improved data-producing technologies on a yearly basis and the challenge has become how to effectively incorporate this new data into systems models. My research relies on other people developing methods to transform raw data into measurements that can be incorporated into the models I work with. These are strong points for how my research is accelerated by others conducting science openly and transparently. I think the reverse is also true, in that the models I develop help others understand how their data are being used. This feedback loop provides insights into how both of our research projects can be better and more effectively used. This only happens when data is actively and freely shared, which is a core value that I maintain in my research.


Michael Menden In molecular medicine, barriers exist among competing research labs, and information exchange with pharma companies is limited. Researchers and the pharma industry are driven by publication pressure and protecting patents, respectively. Therefore, both of them are often afraid of transparently sharing their data and knowledge. I believe the best way to overcome such fear is communication and trust, working as colleagues rather than competitors. In such an ideal research environment I could investigate drug efficacy in cancer in a much more meaningful way. For example, the chemistry of drugs plays a major role in the response; however, for a large number of compounds the 3D structure is kept secret, although this might be the key factor for a better molecular understanding. There are pioneering efforts in this direction; for instance, some pharma companies are building research partnerships. I work at the European Bioinformatics Institute, whose mission is to openly distribute data and provide analysis methods. I am enthusiastic to work in such an environment. I am looking for more opportunities to distribute my methods further as well as to learn from others, and I am very excited about the unique opportunities that Synapse provides for this. In my opinion, establishing more fundamental collaborations and building open scientific networks is the future of modern medicine, which will ultimately lead to better patient care.


Tony Yang In a world where healthcare research is increasingly data-intensive due to the proliferation of digital technologies and the demand for answers in the current era of fast-paced interdisciplinary innovation, open research and data sharing offer many benefits to the research community. The predominant benefit is accelerated scientific progress. Advances are clearly valuable to the healthcare research community when translated into improved patient outcomes, reduced research costs, and decreased time in moving discoveries from the bench to the bedside. With access to original datasets, healthcare researchers can verify results more readily and accurately. Transparency also allows researchers to understand how data was compiled and how conclusions were drawn, making it much easier to determine whether a particular dataset is an appropriate source for secondary research. Of more immediate benefit to healthcare researchers, conducting research transparently increases the visibility and relevance of research output. Open research generates opportunities for additional publications through collaboration, and may increase the citation rate of primary publications. Since publication history and citation impact are often considered in future funding decisions, these benefits are likely to accelerate research programs, and thus enhance the reputation of the academic institutions.

