"Big Data" generally refers to "data too big to be handled and analyzed by traditional database protocols."[1] As a subject of academic inquiry, Big Data has been examined mostly in research published in the fields of computer science, statistics, and econometrics. However, as organizations and public relations (PR) practitioners consider the ethical issues surrounding Big Data, the subject is rapidly gaining interest among scholars in the business, communications, and legal disciplines.[2]

The Origin of Big Data

While the term "Big Data" appeared in the popular press[3] as early as the 1980s, it was not then used to describe the large-scale data systems that we associate with Big Data today. It was not until the 1990s that the phrase became associated with the computational power that enables organizations to obtain unprecedented amounts of information about any number of stakeholders. In this context, the first use of the term has been credited[4] to California-based Silicon Graphics International (SGI). The company featured Big Data prominently in its advertising activities and as a session theme in technical seminars.[5]

Scholarly Inquiry in Communications

As a subject of ethical inquiry, Big Data has received considerable attention from communications scholars danah boyd and Kate Crawford. According to boyd and Crawford, Big Data is "less about data that is big than it is about a capacity to search, aggregate, and cross-reference large data sets" (p. 663). In this way, some of the data encompassed by Big Data (e.g. all Twitter messages about a particular topic) may not be as large as earlier sets of data (e.g. census results) that were not viewed as Big Data. The authors define Big Data as a cultural, technological, and scholarly phenomenon that rests on the interplay of three elements:
  1. Technology: maximizing computational power and algorithmic accuracy to gather, analyze, link and compare large data sets
  2. Analysis: drawing on large data sets to identify patterns in order to make economic, social, technical, and legal claims, and
  3. Mythology: the widespread belief that large data sets offer a higher form of intelligence and knowledge that can generate insights that were previously impossible, with the aura of truth, objectivity, and accuracy.[6]
While boyd in particular has been gaining prominence as a presenter on Big Data ethics within the business and scholarly communities, O'Reilly Media has also taken a step toward bringing the subject to mainstream audiences. In Ethics of Big Data, published by O'Reilly in September 2012, authors Kord Davis and Doug Patterson provide examples of both real-life and potential ethical issues for organizations that use Big Data. While Davis and Patterson do not address PR practitioners specifically, their examples have obvious implications for PR professionals as decision makers and spokespersons.

Ethical Issues

Davis and Patterson (2012, pp. 7-8)[7] provide several examples of ethical issues relating to Big Data, each of which would typically require an exercise in ethical decision-making by an organization's PR practitioners:

  1. The ethics of setting insurance rates based on browser history; e.g. increasing life insurance rates based on visits to websites that provide information on chest pain
  2. Providing a detailed record of a vehicle's GPS history, including locations and speed, to third parties
  3. The use of genetic information to influence hiring practices
  4. Retrieval of information about a person, including criminal records and browsing history, based on a photo snapped with a mobile phone while logged into a “dating” app, and
  5. Targeting marketing initiatives based on purchase history.
While the final example in this list may appear to pose the least serious ethical dilemma for PR practitioners, a recent case study offers a different perspective:

Case Study: The Pregnancy Predictor

Numerous retailers employ Big Data technologies to analyze individual buying behaviour and personalize marketing efforts accordingly. Such was the case when US-based retailer Target determined, through its “pregnancy predictor” algorithm (a pregnancy prediction "score" based on buying behaviour), that one of its teen shoppers was likely within her first 20 weeks of pregnancy. Based on this determination, the retailer used addressed ad mail to promote maternity clothing and nursery furniture to the teen consumer. When the first direct mail piece arrived at the teen’s home, it was received by her father who, unaware that his daughter was pregnant, proceeded to the local Target store, where he accused the retailer of promoting teen pregnancy. The manager examined the addressed ad mail piece, apologized to the father and, a few days later, called to apologize again. This time, however, the father did not respond with the outrage that the manager had witnessed during their previous exchange. In an article for the New York Times, Charles Duhigg[8] quoted the father’s response to the manager: “I had a talk with my daughter,” he said. “It turns out there’s been some activities in my house I haven’t been completely aware of. She’s due in August. I owe you an apology.”

Ethical Framework

Each of the above examples calls for a moral judgment by decision makers within the organization. In the absence of a formal ethical framework for the use of Big Data, employees may end up reverting to their own moral codes. This can create internal conflict when one employee's "creepy" technology is another's "good business."

It is generally accepted that ethical online public relations assumes a respect for user privacy and security of data. As Hallahan[9] notes, both are vital to safeguarding the confidentiality of personally identifiable information about users. However, in a cross-disciplinary management team, not all members may view the ethical issues surrounding Big Data in the same way.

Davis and Patterson propose a four-question framework as a first step toward establishing common ground when discussing Big Data issues:
  1. Identity: How do employees view offline vs. online existence? Are they the same or different?
  2. Privacy: Who should control access to data in the organization?
  3. Ownership: Who owns data, can we transfer the rights to it, and what are the obligations of the people who generate and use the data?
  4. Reputation: Do employees understand that every digital conversation and interaction has an impact on the organization's ability to manage reputation?
The authors contend that these questions can help guide organizations in reducing their value conflicts and working toward the establishment of Big Data policies. While privacy policy is one potential outcome of Big Data ethical inquiry, it is not the only one. Others include policies surrounding data ownership and new media "cybernetics" (i.e. data trails that show and predict user content preferences, such as Amazon.com's well-known "suggestions" program).[10] From a structural perspective, organizations have also begun to reflect the increasing role of Big Data by creating vice-president and other leadership positions dedicated to it.

Future Considerations

Writing in 2006, Hallahan[11] stated that, within the public relations profession, only two small groups focus specifically on technology-related issues: (1) the Public Relations Society of America (PRSA) technology section, and (2) the then-new International Association of Online Communicators. As Big Data continues to generate ethical debate among public relations practitioners and scholars, it will be important to examine the manner in which professional associations respond.

References

  1. Davis, K., & Patterson, D. (2012). Ethics of big data [Mobi edition]. Retrieved from oreilly.com
  2. Davis, K., & Patterson, D. (2012). Ethics of big data [Mobi edition]. Retrieved from oreilly.com
  3. Larson, E. (1989, July 27). They're making a list: Data companies and the pigeonholing of America. The Washington Post. Retrieved from http://www.highbeam.com/doc/1P2-1203414.html
  4. Diebold, F.X. (2012). On the origin(s) and development of the term, “big data”. Manuscript, Department of Economics, University of Pennsylvania. Retrieved from http://www.ssc.upenn.edu/~fdiebold/papers/paper112/Diebold_Big_Data.pdf
  5. Mashey, J.R. (1998, April 25). Big data and the next wave of infrastress [PowerPoint slides]. Retrieved from http://static.usenix.org/event/usenix99/invited_talks/mashey.pdf
  6. Boyd, d., & Crawford, K. (2012). Critical questions for big data. Information, Communication & Society, 15(5), 662-679.
  7. Davis, K., & Patterson, D. (2012). Ethics of big data [Mobi edition]. Retrieved from oreilly.com
  8. Duhigg, C. (2012, February 16). How companies learn your secrets. The New York Times. Retrieved from http://www.nytimes.com/2012/02/19/magazine/shopping-habits.html
  9. Hallahan, K. (2006). Responsible online communication. In K. Fitzpatrick & C. Bronstein (Eds.), Ethics in public relations: Responsible advocacy (Ch. 7) [Kindle edition]. Retrieved from http://amazon.com
  10. Gordon, A.D., Kittross, J.M., Merrill, J.C., Babcock, W., & Dorsher, M. (2011). Controversies in media ethics (3rd ed.). New York: Routledge.
  11. Hallahan, K. (2006). Responsible online communication. In K. Fitzpatrick & C. Bronstein (Eds.), Ethics in public relations: Responsible advocacy (Ch. 7) [Kindle edition]. Retrieved from http://amazon.com