Big data needs firmer foundations

   

Emerging technologies demand action from universities, funders and governments, say Mattias Björnmalm and Max Lu

It has become a cliché to call data the new oil but, like many clichés, it has a kernel of truth.

Like crude oil, raw data is of limited value. But when refined and upgraded, data can fuel everything from the frontiers of academic research to businesses across the economy. 

Increasingly, data means big data: datasets that are too large and complex to be handled with conventional approaches, and that instead require specific tools for analysis and information extraction. These include data mining and machine learning, which in turn promise to drive advances in areas ranging from personalised medicine and national security to next-generation batteries and solar cells.

But, also like oil, data can lead to corruption and pollution. Data intended to be anonymous, such as patient records, can be intentionally or unintentionally identified with individuals. Datasets with inbuilt biases, in areas such as race and gender, can propagate or even worsen existing prejudice.

And data, like oil, is unstable if treated improperly. The digital infrastructure that underpins the availability, integration and interconnectedness of large-scale datasets was not designed for the ways in which big data is increasingly generated and used, and it often lacks the sustainable funding needed to adapt.

Combined with often outdated legislation, this provides an unstable foundation for developments in the emerging technologies that are expected to profoundly change our societies. 

Public assets

To explore how the foundations for big data can be shored up, in 2019 the university association Cesaer and the UK Royal Academy of Engineering formed a joint task force on issues relevant to key technologies such as artificial intelligence, quantum technologies and nanotechnologies. Last month, it published its statement, Key Technologies Shaping the Future: Foresight and strategic recommendations.

The statement lists takeaway messages on the values and leadership needed to develop these technologies fairly and sustainably. It also suggests actions that universities, funders and governments can take to place big data on more stable foundations.

For university leaders, the task force sets out four recommendations. First is a duty to defend scientific knowledge and technology, including data and digital assets, as public assets. This should involve retaining the rights to scientific findings and outputs, including publications, to prevent siloing and lock-in to commercial platforms. Ensuring effective security and fair value chains for the use of data is also vital. 

Second, universities must work to shape the future of the European and global data landscape, including by contributing to the European Open Science Cloud and joining the EOSC Association.

Third, they should ensure professional data support services, for example by following the rule of thumb of employing at least one data steward for every 20 PhD candidates.

Fourth, universities must be at the forefront of projects that use data to improve quality of life, process efficiency and institutional strategy, without compromising privacy and data-safety standards.

For research funders, including national agencies, the first priority must be to provide sustainable funding for the professional support and infrastructures needed for long-term management and stewardship of data and digital assets. They should also support multidisciplinary projects aimed at the grand challenges of the 21st century that use and extract value from data and digital assets.

Governments and policymakers can help these efforts by creating legal frameworks that reflect the context of public research and education, and protect the generation, sharing and preservation of scientific knowledge and technology.

Legislation should be accompanied by a joint effort by policymakers and the scientific community to develop coherent, clear and internationally aligned strategies to support the professional management and stewardship of data and digital assets. Strategies and policies must be targeted to increase transparency and build trust in the use and deployment of big data. There is already much to build on here, such as the FAIR principles, which state that digital assets should be findable, accessible, interoperable and reusable.

Our societies are at a tipping point, facing huge local and global challenges, ranging from pandemics to climate change, along with rapidly emerging and increasingly mature technologies. Complacency is not an option. 

The technologies currently emerging are likely to have profound impacts; to guide their development in a positive direction, it is vital that we have a stable foundation to build from.  

Mattias Björnmalm is deputy secretary-general of the Cesaer group of universities of science and technology. Max Lu is vice-chancellor of the University of Surrey, UK, and chair of the Cesaer/Royal Academy of Engineering task force on key technologies.

This article also appeared in Research Europe