Data-driven approaches are increasingly valuable as our ability to store massive amounts of it, the computational power to crunch through it, and the advanced analytics to make sense of it have come to maturity. These opportunities have led to the development of major facilities for aggregating, analyzing, and monetizing data from industrial sources. But the promise of Big Data, machine learning, and data analytics is predicated on access to data. This article delves into four distinct but somewhat overlapping challenges at play in terms of access to data: ownership of data, data nationalism, cybersecurity, and data privacy.
It was late at night when a team member called: Everything had gone black at the major industrial installation we had been monitoring. The facility’s owner had brought us in to evaluate advanced analytics for some critical hardware. We contacted the facility and determined the cause—our system had been disconnected. We were dead in the water.
Was the equipment damaged? Was there a mistake? It turned out that the equipment manufacturer had been at the plant for some upgrades and had disconnected these systems. A closer look at the contract indicated that the equipment manufacturer was acting within its rights under the agreement. Such a scenario can happen at any plant with any service provider, now that data is more valuable than ever before.
Navigating the data economy in the social media and consumer world is now a topic of intense discussion and even regulation, but it is an emerging challenge in the industrial sector. Airplanes, power plants, electric grids, factories, and other industrial facilities and machinery are using (and producing) more and more data, and the engineers who operate this equipment are finding they are spending valuable time figuring out how to balance the competing claims of ownership over industrial data.
Decades ago, manufacturers of heavy industrial equipment made the majority of their money on the supply side, selling turbines, aircraft engines, locomotives, and other big, physical assets. A combination of maturing economies that don’t require as much equipment, greater competition from deregulation, and international suppliers has driven prices to the point where new strategies are needed to remain competitive. As a result, manufacturers’ business models have shifted significantly toward the services, maintenance, and overhaul that will be needed over the life of the asset.
Today these trends are meshing with the growing ability to extract valuable data and insights from operating industrial assets. The declining cost of sensors and data storage, coupled with growing computing power and analytic tools, opens up new insights into how facilities operate and what can be done to improve performance and drive down associated operational costs. But this firehose of data spewing from every device presents a choice for individuals, companies, and even nations. Is it ultimately more valuable to hoard data and control it, even if that leads to lost opportunities for creating synergies? Or should access be more open, even at the risk of letting more nimble competitors make better use of data you collected? To what extent should regulators intervene in this space, such as by compelling companies to share aspects of this data, or enforcing strict controls over data security in matters influencing delivery of critical services to the public sector, such as water or electric power?
And these insights are worth big money. In their 2012 white paper, “Industrial Internet: Pushing the Boundaries of Minds and Machines,” Peter C. Evans and Marco Annunziata noted that even a 1 percent improvement in efficiency in the industrial sector could lead to enormous savings. Their calculations suggested that, over a 15-year period, the savings would be $27 billion in the rail sector, $30 billion in aviation, $66 billion in gas-fired power generation, and $90 billion in oil and gas. The entity that provides the expert advice to the plant operator can certainly expect to charge some fraction of this value.
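To put those figures in perspective, the short sketch below simply adds up the four sector estimates quoted here and spreads them over the 15-year period. It covers only the sectors named in this article, so the total is not the white paper's own headline number. The `fee_fraction` parameter is a purely hypothetical illustration of "some fraction of this value," not a figure from the white paper or from any service contract.

```python
# Sector savings quoted in the article, in billions of US dollars over 15 years.
SAVINGS_BILLIONS = {
    "rail": 27,
    "aviation": 30,
    "gas-fired power generation": 66,
    "oil and gas": 90,
}
PERIOD_YEARS = 15


def total_savings(savings=SAVINGS_BILLIONS):
    """Sum of the quoted sector savings, in billions of dollars."""
    return sum(savings.values())


def average_per_year(savings=SAVINGS_BILLIONS, years=PERIOD_YEARS):
    """Quoted savings spread evenly over the period, in billions per year."""
    return total_savings(savings) / years


def advisory_revenue(fee_fraction, savings=SAVINGS_BILLIONS):
    """Hypothetical revenue if an adviser captured fee_fraction of the savings.

    fee_fraction is an assumption for illustration only.
    """
    if not 0 <= fee_fraction <= 1:
        raise ValueError("fee_fraction must be between 0 and 1")
    return fee_fraction * total_savings(savings)


if __name__ == "__main__":
    print(f"Total over {PERIOD_YEARS} years: ${total_savings()} billion")
    print(f"Average per year: ${average_per_year():.1f} billion")
    # Assumed 10 percent capture rate, chosen only to show the scale.
    print(f"At a 10% fee: ${advisory_revenue(0.10):.1f} billion")
```

Even under a modest assumed capture rate, the sums involved are large enough to explain why manufacturers, operators, and third-party analytics providers all want a claim on the underlying data.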
These opportunities have led to the development of major facilities for aggregating, analyzing, and, hopefully, monetizing data from industrial sources. One of the largest such facilities in the world is operated by General Electric just outside of Atlanta. This facility reads in 200 billion data points per second of live-streamed data from more than 5,000 power generating facilities in more than 60 countries, analyzing it for trends that can be used to detect anomalies, increase reliability, and improve efficiency. Mitsubishi Hitachi Power Systems operates its own remote monitoring center in Orlando, Fla., that similarly streams data 24/7 from plants worldwide.
But the promise of Big Data, machine learning, and data analytics is predicated on access to data. There are four distinct, but somewhat overlapping challenges at play in terms of access to data: ownership of data, data nationalism, cybersecurity, and data privacy.
Data Ownership, the question of what company or individual owns data, has been, and continues to be, an issue for the consumer internet. To more firmly establish personal data rights, the Federal Communications Commission has required internet service providers to obtain their customers’ permission to collect and use personal information (web browsing history, app usage, health and financial information, and geolocation information).
The industrial internet faces somewhat different circumstances. Here data ownership is typically determined through contractual negotiations between private companies. When data was sparse, and often not even used when available, the terms and conditions regarding data exchange and storage were minor issues. Now that data is more valuable, we are seeing the ownership of data become more contested.
The fuzzy property right to data has led to the sharp-elbowed encounter described in the opening paragraph, and to a variety of actions to protect data, to get access to data, and even actions from policymakers to “democratize” data. Similarly, it has led a variety of companies to develop new and creative business models that give them access to data that can be monetized. Google’s Nest brand of smart home thermostats is a prime example. In addition to being a thermostat that does a lot of cool stuff, like reducing power needs and wasted energy, it also generates a lot of data about energy patterns. That data can be monetized by enabling insights into how to further improve energy performance or by suggesting times for needed maintenance.
Data Nationalism refers to actions that a country may take to keep data within its boundaries. This is harder than it sounds. When a passenger departs Hartsfield-Jackson Atlanta International Airport on a Delta Air Lines flight on a Boeing airplane equipped with GE engines, and flies to Beijing Capital International Airport in China, there are multiple overlapping claims to the data generated during the flight. Who owns the data? Boeing? Delta? GE? China?
Some of the data rights are straightforwardly outlined in pre-determined contractual arrangements. But in many cases, there is ambiguity. And there are indications that data nationalism is expanding with a growing number of countries imposing restrictions on the cross-border flow of data. Just as countries would not allow the export of some valuable mineral or other asset without compensation, so they are realizing that the data generated in-country has great value.
As a result, some companies have had their remote monitoring programs abruptly terminated. Others have faced demands that they locate data centers and hire data analysts locally.
These interventions have the potential to undermine a significant part of the value of industrial data, which comes from aggregating long operational histories across many units. Mass isolation of data for any reason hinders the ability of data analytics to uncover subtle trends and further develop predictive capabilities. But the interests of the developing world, where much of the most significant growth for power plants and other infrastructure is occurring, must also be considered. These regions certainly would like the value creation and ancillary services that come from the data generated by these facilities to lie within the country, rather than just being sold back to them.
A related issue is Cybersecurity. In an ideal world, distributed assets, such as an electricity distribution system, would be connected to a larger network, enabling experts to look for potential problematic behaviors or to identify opportunities to increase efficiency or reduce costs. Such data would then be aggregated across a large number of locations in order to build the databases needed to sleuth out more subtle or lower probability issues.
However, such access creates vulnerabilities. While many of the ongoing cyberattacks on critical infrastructure are classified, a number of incidents have made it into the open press. One of the highest-profile incidents was the 2015 hacking of the Ukrainian power grid, which led to widespread power outages and has been attributed to state-sponsored actors in Russia. Similarly, the Department of Homeland Security and the FBI issued a public alert in 2018 on Russian government cyber actors who gained remote access into US energy networks.
Indeed, it seems increasingly clear that it may not be possible to completely prevent suitably resourced hostile actors from accessing critical industrial infrastructure, leading to discussions of how to limit or manage risk. Such issues will be an inherent limiter of the ability to access data from industrial facilities.
Finally, Data Privacy issues come up when data sets come from individual consumers, such as those who own internet-connected cars or whose electricity use is monitored via smart meters. Connected devices promise to provide better services to consumers. Smart meters, for instance, offer great potential to reduce electricity bills for homeowners, or to improve the reliability of the electricity system.
Sharing this data with companies has a downside, especially if it is unsecured. Personal energy use patterns could conceivably be available to others. With the data from smart meters, robbers could look for patterns of energy use to identify predictable times when a homeowner is away from the house. These privacy concerns have led to public backlashes against companies harvesting data without customers’ consent. Privacy issues take a different tenor altogether when consumers purchase these systems themselves and opt in. The Nest smart home appliances generate a lot of data, for instance, so the company has striven to be transparent in its data privacy policies.
Data privacy is an issue where national government policy interventions have occurred and will likely be extended in the future. The European Union has promulgated the General Data Protection Regulation, or GDPR, whose overall goal is to give primary control of personal data to the individuals from whom that data originated. (Tying back to the data nationalism issue described above, the GDPR also addresses the export of personal data.) It’s likely that the world will be divided into regions with stiff rules on data privacy, such as Europe, and those with less regulation, such as China.
The combination of Big Data and advanced analytics is powerful. There are some exciting ideas being envisioned. For example, organizations such as the Electric Power Research Institute are demonstrating how the integration of data across the energy, communications, water, and security industries can lead to new cost savings, better reliability, and decreased environmental impacts. Similarly, Georgia Tech has a focused strategic initiative called “Energy in the Information Age,” to direct researchers towards the problems at the intersection of physical energy infrastructure and the data world. The promise of the connected internet of industrial things is that every event that spawns data becomes an opportunity for every company to learn best practices.
But just at the time that the internet promises democratization of information, it’s not inconceivable that the opposite will happen. The new monopolies will not be in delivery of commodities like oil, steel, or sugar, but in ownership and access to critical data.
It’s too soon to say which direction the industrial data economy will take: open and connected, or closed and monopolistic. But that choice will determine what companies working in the industrial sector will look like, and the business models they use, for decades to come.