Sep 16, 2016

Data for humanitarian purposes

Unni Karunakara, a former president of "Doctors Without Borders", gave a talk at International Data Week 2016 on September 13 about the role of data in humanitarian organizations. The talk was powerful in its simplicity and in its urgent call for better data, better data management, and better dissemination. It was a story of human suffering, but also a story of care and integrity in using data to alleviate it.

Humanitarian action can be defined as moral activity grounded in the ethics of assistance to those in need. Four principles guide humanitarian action:
  • humanity (respect for the human)
  • impartiality (provide assistance because of a person's need, not politics or religion)
  • neutrality (tell the truth regardless of interests)
  • independence (work independently from governments, businesses, or other agencies)
These principles affect how to collect and use data and how to ensure that data helps. Data collected for humanitarian action is evidence that can be used for direct medical action and for bearing witness, which is a very important activity of humanitarian organizations:  
“We are not sure that words can always save lives, but we know that silence can certainly kill.” (quoted from another MSF president)
Aware of the serious consequences that data can have in humanitarian action, "Doctors Without Borders" works only with data it collects itself and uses only stories its staff witnessed firsthand. Restraint and integrity in data collection are crucial to maintaining the organization's credibility.

Lack of data, or lack of mechanisms to deliver the necessary data, hurts people. For example, during the Ebola outbreak it took the World Health Organization about 8 months to declare an emergency, and 3000 people died because data was not available in time or in the right form. The Infectious Diseases Data Observatory (IDDO) was created to help with tracking and researching infectious diseases by sharing data, but many ethical, legal, and other issues still need to be solved.

Humanitarian organizations often do not have trustworthy data available, either because of competing definitions or because data collection systems are lacking. For example, because of differences in defining "civilian casualty", estimates of the number of civilians killed in drone strikes range from a hundred to thousands. And in developing countries or conflict zones where census activities are absent or dangerous, counting graves or tents becomes a proxy for mortality, mobility rates, and other important indicators. Crude estimates are then the only available evidence.

"Doctors Without Borders" (MSF) does a lot to share and disseminate its information. It has an open data / access policy and aspires to share data, while placing high value on the security and well-being of the people it helps.


Sep 2, 2016

Workshop: Data Quality in Era of Big Data

The center where I work is organizing a workshop that may interest many who work with data. Scholarships are available.

Data Quality in Era of Big Data
Bloomington, Indiana
28-29 September 2016


Throughout the history of modern scholarship, the exchange of scholarly data was undertaken through personal interactions among scholars or through highly curated data archives. In either case, implicit or explicit provenance mechanisms gave a relatively high degree of assurance of the quality of the data. However, the ubiquity of the web and mobile digital culture has produced disruptive new forms of data. We need to ask ourselves what we know about the data and what we can trust. Failure to answer these questions endangers the integrity of the science produced from these data.

The workshop will examine questions of quality:
  • Citizen science data
  • Health records
  • Integrity
  • Completeness; boundary conditions
  • Instrument quality
  • Data trustworthiness
  • Data provenance
  • Trust in data publishing

The 2-day workshop begins with a half day of tutorials. The main workshop begins in the early afternoon on 28 September and continues until noon on 29 September. With sufficient interest, there may be another training session following the noon conclusion of the main workshop on 29 September.

Early Career Travel Funds:
Travel funds are available for early career researchers, scholars, and practitioners http://d2i.indiana.edu/mbdh/#scholarships

Important Dates:
  • Workshop: Sep 28-29, 2016
  • Deadline for requesting early career travel funds: Sep 9, 2016 midnight EDT
  • Notification of travel funding: Sep 13, 2016
  • Registration deadline: Sep 19, 2016

Organizing Committee:
General Chairs:  Beth Plale, Indiana University

Program Committee
Carl Lagoze, University of Michigan, chair
Devan Donaldson, Indiana University
H.V. Jagadish, University of Michigan
Xiaozhong Liu, Indiana University
Jill Minor, Indiana University
Val Pentchev, Indiana University
Hridesh Rajan, Iowa State University

Early Career Chairs
Devan Donaldson, Indiana University 
Xiaozhong Liu, Indiana University

Local Arrangements Chair
Jill Minor, Indiana University

Aug 17, 2016

SynBERC, anthropological inquiry and methods of research

Recently I've been searching for guidance on how to describe ethnographic methodology in a grant proposal and found P. Rabinow and A. Stavrianakis' commentary "Movement space: Putting anthropological theory, concepts, and cases to the test", where they reflect on the challenges of anthropological inquiry and on what it means to observe in heterogeneous and changing spaces. I had no time then to read it slowly and carefully, so I am filling this gap now.

The essay is a response to another collection of essays, but also a reflection on previous ethnographic research with the Synthetic Biology Engineering Research Center, SynBERC (I wish I had paid more attention to it during my own dissertation research). An honest public account like that contributes to the ethics and methodology discussions more than any published "research" article.
Raising the question of "to what end" in anthropological inquiry, Rabinow and Stavrianakis' essay recollects previous collaborative participant-observations as attempts to bring the ethics that exists outside of the instrumental rationality of science into multidisciplinary research projects.

Flourishing is the concept they used to challenge and change the currently existing relations between knowledge and care (see Rabinow, Paul. "Prosperity, Amelioration, Flourishing: From a Logic of Practical Judgment to Reconstruction." Law and Literature 21, no. 3 (2009): 301-20, jstor). Flourishing helps to examine research practices from a holistic perspective, as practices that are performed by human beings without ethical compartmentalization into scientific, individual, and citizen values.

Then the discussion moves to temporality in anthropological research and the distinction between "contemporary" and "present" in ethnography. This distinction was hard for me to understand. Observations are made in the present, but somehow contextualization with experiences from the past (history) helps to challenge the "ethnographic present". Does it mean that something (objects or practices) may be present but not contemporary? Or that the contemporary may include the past? In other words, the distinction between present, contemporary, and modern allows us to stay attuned to constant change and not to fix descriptions as existing in certain times only. Sometimes it seemed that "contemporary" referred to attempts to reconcile diverse or contradictory practices (e.g., the practices of observation and observers and the practices of the observed).

An interesting point was made on citation (again, as a response to someone else's point). It is about acknowledging more recent work on similar issues. While understanding that these are the rules of the game (especially for getting grants), Rabinow writes that excessive citation also constrains thinking and writing, and authorizes such practices. Why would someone go back to reading and citing Weber, Foucault, or Dewey? Not because what they said is still relevant and true (although some of it is), but because they paid so much attention to problem formation, to the need for conceptual tools, and to the importance of experimentation with form.

Back to SynBERC: it is striking how empty expressions of support from bioscientists and engineers masked indifference and, ultimately, a lack of respect and of willingness to change. What made things worse was that social scientists' efforts to develop effective modes of governance and interaction were blocked and downgraded to non-action and "soothing public relations". Moreover, the social scientists themselves failed to coordinate and to reflect on their complicity with dominant technoscientific norms and values.

Even though I really appreciated this account of anthropological "failure" (as seen by others, but not by the authors who conceptualize it as an "anthropological test"), there is a larger purpose in it. As the authors put it,
... it is time to thematize the new configurations of power relations in which anthropologists are working today. Critique as denunciation, still the dominant mode in anti-colonial narratives, is no longer sufficient for the complexities of contemporary inquiry. We are arguing for a more fine-grained acceptance of the fact that by refusing the binaries of inside and outside, one’s responsibility for one’s position in the field is made available for reflection and invention.
Anthropology's major task is to map heterogeneity of human and cultural forms, including:
  • cultural heterogeneity with an underlying generality (American anthropology)
  • heterogeneity within common institutional forms such as kinship and law (British anthropology)
  • variations in structural patterns of society and the mind (French anthropology)
However, accounts of heterogeneity lost their force, in some ways losing their criteria of validity under the pressure of current norms of conducting research. At the same time critical evaluation of such criteria is an important task in changing present times. Such evaluation can be done through testing - constant re-evaluation of the existing conceptual tools in the context of new situations and experiences. The rest of the detailed discussion on testing was dense, but less relevant to me, so it was also harder to follow. 

One of the take-aways is that anthropology needs to be a collaborative endeavor, where individual inquiries examine specific cases and then many inquirers create a common space of concepts, problems, and cases. The constant movement between specific cases and the topology of cases creates a space where anthropology can make justifiable, warrantable claims about more than one case, i.e., about heterogeneity and associated generality.


Jun 14, 2016

Mapping scientific fields, domains and specialties

I'm embarking on a new project that focuses on mapping research fields and studying the evolution of certain concepts and research communities. I have a certain field in mind that I'd like to investigate, but first I need to learn more about scientometrics and the mapping of research domains. This is the first in a series of notes from my readings: a review chapter in the Annual Review of Information Science & Technology (ARIST) titled "Mapping Research Specialties".

The chapter defines research specialty as a self-organizing network of researchers that tends to study the same research topics, attend the same conferences, publish in the same journals, and also read and cite each others’ research papers.

Other definitions of research specialties:

  • Kuhn (1970) - communities of one hundred members, sometimes less 
  • Price (1986) - an “invisible college” of approximately 100 “core” scientists, monitoring the work of individuals who are rivals and peers by reading about 100 papers for every one published
  • Lievrouw (1990) - a set of informal communication relations among scholars or researchers who share a specific common interest or goal 
  • Small (1980) - consensual structure of concepts in a field, employed through its citation and co-citation network 
  • Rogers, Dearing, and Bregman (1993) - a family tree in which earlier studies influence later studies
Using the term "specialty" rather than "invisible college" avoids the assumption that the researchers are in frequent informal communication.

[Figure: research specialty model. Fig. 6.2 from "Mapping research specialties"]
A research specialty is therefore an interconnected group of researchers that has its own knowledge base, with its own concepts, paradigms, and validation standards, and that uses particular channels of formal and informal communication.

Studies of research specialties are connected to the key questions raised by Chubin in his 1976 review of the field "The Conceptualization of Scientific Specialties":
  1. What are the social and intellectual properties of a specialty? 
  2. How do specialties grow, stabilize, and decline? 
  3. What are the temporal and spatial dimensions of a specialty? 
  4. How do specialties vary in size, scope, and life expectancy? 
  5. What are the institutional arrangements that support specialties? 
  6. What impact does funding have on the kind and volume of research produced in a specialty?
  7. What kinds of communication relations sustain research activities in a specialty? 
The following approaches are used in the studies of research specialties:

  1. The sociological approach (seems to be much more developed than others): science as an institution (Merton); science as a system of beliefs (Bloor, Barnes, Collins); science as culture (Latour, Woolgar, Knorr-Cetina); science as collaboration and competition (Whitley, Gibbons); science as boundary making and demarcation (Gieryn)
  2. Bibliographic or bibliometric: relevance (topics, novelty, availability, etc.); citations and co-citations; author co-citations; co-word analysis
  3. Communicative approach: knowledge diffusion through informal channels and discourses and rhetoric in science 
  4. Cognitive approach: paradigm shift (Kuhn) and branching of ideas (Mulkay)

Mapping research specialties helps to find the structure and dynamics of a research specialty and can include:

  1. A map of the network of researchers and research teams involved with the specialty.
  2. A map of the base knowledge supporting research in the specialty.
  3. A map of current research topics in the specialty.
A map of a specialty is a representation of the structure and interconnection of known elements of the specialty, which includes research topics, teams, concepts, authorities, archival journals, research institutions, and technical vocabularies. Mapping techniques often include bibliometric methods, such as reference co-citation analysis, bibliographic coupling analysis, co-authorship analysis, author co-citation analysis, co-word analysis, paper-to-paper citation analysis, journal-to-journal citation analysis, and journal co-citation analysis.
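
To make two of these bibliometric methods more concrete for myself, here is a minimal sketch of reference co-citation analysis and bibliographic coupling analysis. It is not taken from the chapter: the toy data, the paper and reference labels, and the function names are all my own assumptions, and a real study would start from exported citation records rather than a hand-written dictionary.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each citing paper maps to the set of references it cites.
citations = {
    "paper_A": {"ref_1", "ref_2", "ref_3"},
    "paper_B": {"ref_1", "ref_2"},
    "paper_C": {"ref_2", "ref_3"},
    "paper_D": {"ref_4"},
}


def co_citation_counts(citations):
    """Count how often each pair of references is cited together by the same paper."""
    counts = Counter()
    for refs in citations.values():
        # Sorting gives every pair a stable (smaller, larger) key.
        for pair in combinations(sorted(refs), 2):
            counts[pair] += 1
    return counts


def bibliographic_coupling(citations):
    """Count the references shared by each pair of citing papers."""
    strengths = {}
    for a, b in combinations(sorted(citations), 2):
        shared = len(citations[a] & citations[b])
        if shared:  # keep only pairs that share at least one reference
            strengths[(a, b)] = shared
    return strengths


co_cited = co_citation_counts(citations)
coupled = bibliographic_coupling(citations)

print(co_cited[("ref_1", "ref_2")])    # 2: cited together by paper_A and paper_B
print(coupled[("paper_A", "paper_C")])  # 2: both cite ref_2 and ref_3
```

The two measures look at the same citation data from opposite sides: co-citation links the cited references (pointing toward the base knowledge), while bibliographic coupling links the citing papers (pointing toward current research subtopics).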

Other goals of mapping include:

  • Mapping the social network of researchers - identify and characterize researchers and teams of researchers and their sponsoring institutions in terms of productivity, impact of research results, weak ties, levels of participation and collaborations. 
  • Mapping the base knowledge in the specialty - concepts, theories, methods, controversies
  • Mapping the topical structure 
  • Mapping the relations - researchers, concepts, and topics 
  • Mapping changes - shifts in base knowledge and topics, new subtopics, productive researchers, changes in funding
Techniques of mapping can include surveys of subject matter experts, bibliometric techniques (see above), web content analysis, and analysis of formal literature (most developed and frequently done).

The conclusion is not very optimistic though:
The problem of mapping specialties is complex and poorly defined. A number of techniques have been developed and applied. Each of these techniques reveals some separate aspect of the specialty. For example, co-authorship analysis uncovers the social structure of collaboration and research teams in the specialty, co-citation analysis uncovers structure of base knowledge in the specialty, and bibliographic coupling analysis reveals research subtopics. In and of themselves, these analytic techniques are inadequate as tools to map the whole research specialty: the social structure of researchers, the base knowledge they use, and the research topics they study. ... the metaphor of the blind men and the elephant is appropriate, as each analytic technique reveals the specialty in some limited aspect.

What is the solution for examining a specialty as a whole? Combine as many existing techniques as possible or develop some new techniques?

Jun 8, 2016

Cyberinfrastructure studies overview

In their introduction to the special issue on sociotechnical studies of cyberinfrastructure (CI) and e-research, Ribes and Lee identify current themes and methodologies of CI studies (Computer Supported Cooperative Work (CSCW), 2010, Volume 19, Issue 3, pp 231-244, doi: 10.1007/s10606-010-9120-0).

Cyberinfrastructure (CI) is one of the current terms for the technologies that support scientific activities such as collaboration, data sharing and dissemination of findings. CI features that distinguish it from other CSCW work include: community wide and cross-disciplinary scope, computational orientation, and end-to-end (data-to-knowledge-to-user) integration.

Themes in CI studies:

  1. Relationality. What is supporting the work of another and who is sustaining those relationships?
  2. Integration of heterogeneity. CI involves computer specialists, data and information managers, domain scientists, and so on, but also non-human actors such as sensors and databases.
  3. Sustainability. What makes CI a long-term resource?
  4. Standardization. Ways to achieve integration on the technical and human levels.
  5. Scale. How to plan for change and growth in the number of collaborators, the quantity of data, and the geographical reach.
  6. The distribution between human work and technological delegation. 

Methods include historical, ethnographic, documentary, and interview-based approaches that focus on the following:

  • Investigations of ongoing planning, development and deployment efforts 
  • Activities of maintenance, upgrade and breakdown
  • Adoption of certain expressions of scientific activity and changes in their use
  • Adoption of new technological artifacts

Units of analysis can be a project or CI as a whole (focus on national policies and funding incentives). The introduction concludes by calling for more studies:

The stories of cyberinfrastructure are revealed by looking across multiple levels of granularity, various facets of social life, and diverse technological actors. Much remains to be studied in the areas of supporting domain specific practice, data sharing and curating, and infrastructural organizings. This is an exciting time for CI studies. Research is occurring in new and unexpected places, drawing on and bringing together the traditions of CSCW, information science, organizational studies, and science and technology studies. This cross-pollination, as exemplified by the papers in this issue, seems to be not only fruitful, but also very necessary.