Sep 23, 2014

Summer school on synthetic biology

During the week of September 15-19, 2014, I participated in the summer school on societal implications of synthetic biology. Organized by Kristin Hagen and Margret Engelhard from the European Academy of Technology and Innovation Assessment and by Georg Toepfer from the Center for Literary and Cultural Research Berlin, it was held in Berlin, Germany, at the Center for Literary and Cultural Research.

Participants came from different countries - Austria, Italy, Germany, the Netherlands, Canada and the United States. Similarly, their backgrounds were quite diverse - biology, chemistry, philosophy, sociology, political science, and communications. The main goal of the school was to have an interdisciplinary discussion about synthetic biology as an emerging area of science and its implications for society. Participants wrote papers and presented them at the school. Additionally, several experts from various fields gave talks. Below is a short summary of what we talked about:

The meanings and metaphors of life. Synthetic biology inevitably raises questions related to our understandings of life. On the one hand, there is no universal definition of life, and both philosophers and scientists continue to ponder whether it is even possible to come up with such a definition. On the other hand, there may be no need for such a definition, because a) we have an intuitive understanding of what life is and adapt as it changes, and b) limited definitions work for specific purposes, such as understanding how to create an artificial cell or arguing against the scientific possibility of creating life from scratch. Metaphors that we use to answer the grand questions of life or to promote scientific advancements in synthetic biology bring together the domains of nature, artificiality, control, and aesthetics. Those metaphors are not “innocent” as they open some opportunities and close others.

Synthetic biology (SB) as a field. Synthetic biology is not a homogeneous discipline; it is a fusion of approaches that draw on synthetic chemistry, genetic engineering, and bioinformatics. The engineering of metabolic pathways, which allows bacteria and other microorganisms to be used to produce chemicals, plays an important role in SB breakthroughs. Chemical synthesis of DNA, which allows synthesized DNA to be inserted into an existing organism, is another important area of synthetic biology. The presentations that explained various types and flavors of synthetic biology talked about cells, pathways, chassis, microbes, reproduction, and evolution; they were colorful and full of exciting possibilities. We talked a lot about the promises of synthetic biology, but I don’t think that science necessarily needs promises to justify its existence. As someone pointed out, science is a quest for knowledge; it should be interesting and exciting as such. I’m not sure science is a pure quest for knowledge, considering the convergences between science, technology, and industry. Nevertheless, I completely agree that it is exciting to learn about the world even if it's not clear whether this knowledge has applications.

Forms of communication and public dialog. Previous debates, such as those over mad cow disease or GMOs, and the resulting negative reactions demonstrate the importance of transparency in public communication of science. Early public engagement is seen as a way to improve understanding and acceptance of technology. At the same time, the goal is not simply to promote public understanding and acceptance of technoscience, but rather to let voices of the public contribute to decision-making and regulatory frameworks. Many forms of public engagement, including polls, surveys, citizen panels, and public discussions, have been promoted in the EU. The results seem to indicate that even though not many people have heard about synthetic biology, many see continuities with previous scientific advancements and technologies and are willing to consider both its positive and negative aspects.

Even from the short overview above it is obvious that there is a great diversity in the issues surrounding synthetic biology and approaches to their evaluation. Can they be integrated or synthesized? My own suggestion is to take a problem- rather than a debate-oriented approach and look for solutions to specific problems, while avoiding taking things for granted. Everyone has their interests and values and even the best intentions may result in bad outcomes. To use M. Foucault’s approach, we need to examine the order of things and the complex arrangements of what’s visible and hidden and what or who is included and excluded.

It was a week of stimulating discussions. The atmosphere was very friendly and collegial, and the disagreements were often phrased as humorous, slightly sarcastic remarks over dinner or drinks. My take-away from this summer school is that interdisciplinary dialog is possible, necessary, and fruitful. It works provided that we have ample time to interact and go beyond formalities (i.e., beyond formal presentations and opinion polls). The school has ended, but the work continues. We will revise our papers based on collective feedback, and they will become chapters in a forthcoming book.

May 9, 2014

Big data report from the White House

Another big data review, this time from the White House - "Big Data: Seizing Opportunities, Preserving Values" (pdf). The report explains what big data is (large, diverse, complex, longitudinal, distributed, making possible unexpected discoveries and creating an asymmetry of power between those who hold the data and those who intentionally or inadvertently supply it) and describes implications of big data for public and private sectors. In addition to many known and less known examples of how big data can be good or bad, the report provides initial thoughts on recommendations for big data governance. It divides its approach to a policy framework into four overlapping core areas:

1. Big data and citizens - improve public services while preventing the government from accruing unlimited power by using increased surveillance, algorithmic profiling, and metadata tracking.

2. Big data and consumers - reduce cost of commercial services and personalize them while mitigating security breaches and risks of discrimination based on consumer profiles and lack of consumer awareness and data transparency.

3. Big data and discrimination - do less harm and prevent discriminatory uses of identification and re-identification techniques.

4. Big data and privacy - get used to less privacy while reconsidering the notice and consent framework.

In the concluding section, the report makes the following recommendations:

  • Advance the Consumer Privacy Bill of Rights.
  • Pass National Data Breach Legislation.
  • Extend Privacy Protections to non-U.S. Persons.
  • Ensure Data Collected on Students in School is Used for Educational Purposes.
  • Expand Technical Expertise to Stop Discrimination.
  • Amend the Electronic Communications Privacy Act.

It's a thorough report and is definitely worth a read, but like the big data review my colleagues and I wrote (pre-print), it's just the beginning of studying the implications and governance of big data.

May 2, 2014

Summary of drivers and barriers in data sharing

Nice summary of the drivers, barriers, and enablers that determine stakeholder engagement based on expert interviews in Dallmeier-Tiessen et al., 2014, Enabling Sharing and Reuse of Scientific Data (restricted access).

Drivers and benefits

  • Societal benefits - economic/commercial benefits; continued education; inspiring the young; allowing the exploitation of the cognitive surplus in society; better quality decision making in government and commerce; citizens being able to hold governments accountable.
  • Academic benefits - the integrity of science; increased public understanding of science.
  • Research benefits - validation of scientific results by other scientists; recognition of their contribution; reuse of data in meta-studies to find hidden effects/trends; testing new theories against past data; doing new science not considered when data was collected without repeating the experiment; easing discovery of data by searching/mining across large datasets with benefits of scale; easing discovery and understanding of data across disciplines to promote interdisciplinary studies; combining with other data (new or archived) in the light of new ideas.
  • Organizational benefits - publication of high quality data and citation of data enhance organizational profile; preserved data linked to published articles adds value to the product; data preservation is more business; reputation of institution as “data holder with expert support” is increased; combining data from multiple sources helps to make policy decisions; reuse of data instead of new data collection reduces time and cost to new research results; use of data for teaching purposes.
  • Individual contributor benefits - preserving data for the contributor to access later (sharing with your future self); peer visibility and increased respect achieved through publications and citation; increased research funding; increased control of organizational resources once more established in their careers; the socio-economic impact of their research (e.g., spin-out companies, patent licenses, inspiring legislation); status, promotion and pay increase with career advancement; status-conferring awards and honors.

Barriers and enablers are related to:

  • Individual contributor incentives
  • Availability of a sustainable preservation infrastructure
  • Trustworthiness of the data, data usability, pre-archive activities
  • Data discovery
  • Academic defensiveness
  • Finance
  • Subject anonymity and personal data confidentiality
  • Legislation/regulation

Apr 15, 2014

Survey of digital curation and curators

I am conducting a survey of digital curation and digital curators.

If you are involved in taking care of digital materials of any type, form and purpose and are interested in the advancement of digital curation as a professional field, feel free to take the survey and share it with colleagues. The survey takes about 20 min to complete and can be found at http://bit.ly/1osgTQ7

Feb 12, 2014

Kinds of data and their implications for data re-use

Notes on a paper "What Are Data? The Many Kinds of Data and Their Implications for Data Re-Use" (Journal of Computer-Mediated Communication, 2007, Vol. 12, N 2, pp. 635–651):

The paper reports on ethnographic research into data sharing practices in four projects that served as case studies. The goal was to reflect on the technical and social contexts of enabling data for re-use. The four projects were:

  • SkyProject – a collaborative project to build the data grid infrastructure for U.K. astronomy
  • SurveyProject – a project to produce a yearly large-scale complex survey dataset (~10,000 U.K. households)
  • CurationProject – a digitization and access project of artifacts and photographs collected since 1884
  • AnthroProject – a digitization of anthropological materials collected in a range of countries over one researcher's academic career

The fieldwork included interviews with participants, who were recruited via a snowballing technique and by following the paths that the data took through each project, plus document analysis and observation (including project websites, conferences and other face-to-face meetings). Participants included people involved in data collection, processing, analysis and reuse.

Below are some observations on the data practices at four stages of the data lifecycle:

  • Data Collection: some disciplines produce digital data, while others (e.g., AnthroProject) work with a mix of digital, non-digital and legacy data (tapes, diaries, photographs, etc.). The labor of digitization is often ignored, but it's still very important at the stage of data collection.
  • Data Formatting: data need to be transformed (converted, re-formatted, flattened, etc.) to be re-usable by others. In SurveyProject, for example, variables are renamed and recoded, files are renamed and loaded into a database. The processes of converting variables (e.g., words into numbers) and of successive renaming and restructuring make collected materials visible, manageable, communicable, and intelligible for others. Such transformations into manageable and communicable chunks are difficult for disciplines that see their primary goal as describing the specificities of particular contexts and drawing distinctions as opposed to generalizations and simplification.
  • Data Release: ownership, consent, and ethics differ depending on whether people are represented in the source data or not. In AnthroProject, where the point of the data is their subject-specificity, anonymization is largely impossible to achieve.
  • Data Re-Use: the case studies suggested that histories and configurations of research communities influence how data are documented, contextualized and checked for quality. Overall, the following aspects are important for data to become re-usable: conditions and context of data capture (e.g., atmospheric conditions or community place-time); instrument quality and calibration techniques; data points and variables to be collected; transformation techniques (e.g., statistical methods and parameters).
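
To make the Data Formatting stage a bit more concrete, here is a minimal sketch of the kind of renaming and recoding (words into numbers) described for SurveyProject. It is only an illustration: the variable names, answer categories, numeric codes, and records below are all hypothetical and are not taken from the paper.

```python
# Hypothetical raw survey records; field names and answers are invented for illustration.
raw_records = [
    {"Q1_tenure": "Owner", "Q2_hhsize": "3"},
    {"Q1_tenure": "Renter", "Q2_hhsize": "1"},
    {"Q1_tenure": "owner ", "Q2_hhsize": ""},
]

# Assumed mapping from questionnaire labels to documented variable names.
RENAME = {"Q1_tenure": "tenure", "Q2_hhsize": "household_size"}

# Assumed codebook: words are converted into numeric codes.
TENURE_CODES = {"owner": 1, "renter": 2}
MISSING = -9  # assumed code for missing or unrecognized answers


def recode(record):
    """Rename the variables of one record and recode its values as numbers."""
    renamed = {RENAME[key]: value for key, value in record.items()}
    tenure = renamed["tenure"].strip().lower()
    size = renamed["household_size"].strip()
    return {
        "tenure": TENURE_CODES.get(tenure, MISSING),
        "household_size": int(size) if size else MISSING,
    }


clean_records = [recode(record) for record in raw_records]
```

Even in this toy case, the mappings (the rename table, the codebook, the missing-value code) are exactly the contextual information a later re-user would need in order to interpret the numbers.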

This study introduced some interesting observations, but there were not enough details to make them meaningful. With such a lack of detail and specificity, it's hard to be convinced by the claims that are being made. For example, what questions were asked during interviews? Were there any differences between responses and information in documents, meetings, etc.? Could we make comparisons across all four projects for each stage of the data lifecycle? Are quotes mere illustrations, or are they indicative of some patterns?