Consultation on a European Strategy for Data

2020

 

This page contains the questions from the public consultation by the European Commission on its European Data Strategy. It is intended as a working document for Wikimedians to collaboratively draft Wikimedia's answers to this legislative initiative of the EU.

The EU's survey will remain open until 31 May 2020, but we will take into account input added here up to 16 May 2020.

It is linked to a parallel consultation by the European Commission on Artificial Intelligence, as such systems require massive amounts of information.

Introduction

Europe is undergoing a digital transition that is changing our societies and economies at an unprecedented speed. Data is at the core of this transformation. It has an impact on all economic sectors and also on the daily lives of citizens.

The aim of the European strategy for data is to create a single European data space: a genuine single market for data, where personal as well as non-personal data, including confidential data, are secure. This will make it easier for businesses and public authorities to access an almost infinite amount of high-quality data to boost growth and create value, while reducing the carbon footprint of the EU economy.

To fulfil this ambition, the EU can build its single market for data on a strong legal framework in terms of data protection, freedom to provide services and of establishment, fundamental rights, safety and cyber-security – and this will be further stimulated by a large degree of interconnection in digital public services. In addition, the EU has a strong industrial base and a recognised technological capacity to build safe and reliable complex products and services, from aeronautics to energy, automotive, medical equipment and digital.

The Commission is putting forward a European data strategy that benefits society and the entire European digital economy. It puts the citizen at the centre of the data-driven economy while ensuring that European companies and public authorities can capitalise on the data they generate and also have better access to the data generated by others.

This public consultation will help shape the future policy agenda on the EU data economy. It will feed into possible Commission initiatives on access to and re-use of data.

It is structured in two sections. The objective of Section 1 is to collect views on the data strategy as a whole. Section 2 is divided into sub-sections. It aims to collect information on three specific aspects announced in the data strategy:

  1. how data governance mechanisms and structures can best maximise the social and economic benefits of data usage in the EU
  2. the EU-wide list of high-value datasets that the Commission is to draw up under the recently adopted Open Data Directive
  3. the role of self-regulation to implement rules on data processing

Submission

The answers and documents submitted:

  1. Survey Answers
  2. Position on High-Value Datasets
  3. Position on Data Trustees

Section 1: General questions on the data strategy

Do you agree that the European Union needs an overarching data strategy to enable the digital transformation of the society?
  • Yes
  • No
“More data should be available for the common good, for example for improving mobility, delivering personalised medicine, reducing energy consumption and making our society greener.” To what extent do you agree with this statement?
  • Strongly agree ← This is the answer best aligned with a movement for free knowledge…
  • Somewhat agree
  • Neutral
  • Somewhat disagree
  • Strongly disagree
  • I don’t know / no opinion

… but let's note the mention of "the common good", which is as critical as it is ambiguous. I think we don't want, and wouldn't benefit from, the disclosure of data that can threaten our privacy or our individual or collective security. In particular:

  • Data already protected by the General Data Protection Regulation (GDPR), assuming we agree with it.
  • Data that can be used to cause significant harm: data on unresolved software/hardware vulnerabilities, data on how to make weapons (soon, ordinary people could be making firearms with 3D printers or even genetically modifying viruses or bacteria in their homes), etc.
  • Poor quality data (incomplete, outdated, misleading data), whose use will lead to wrong decisions and actions. By saying that more data should be available we don't mean that we support the spread of misinformation.
  • Data whose publication leaves European companies at such a significant competitive disadvantage that the existence of a successful business/industrial/commercial network in Europe becomes impossible, an effect that might not be considered "the common good". This is a delicate point because it may come into apparent conflict with our purpose as a global movement for free knowledge, so we should determine where the threshold is.

What is "the common good"? Who defines it? How?

Do you think that it should be made easier for individuals to give access to existing data held about them, e.g. by online platform providers, car manufacturers, producers of wearables, voice assistants or smart home appliances, to new services providers of their choosing, in line with the GDPR?
  • Yes ← But I don't know what "it should be made easier" means. For example, I may not want a platform to integrate, collect or redirect my personal data from various providers just to make it "easier" for me to access that data.
  • No
‘General data literacy across the EU population is currently insufficient for everyone to benefit from data-driven innovation and to become more active agents in the data economy.’ To what extent do you agree with this statement?
  • Strongly agree
  • Somewhat agree ← Just my opinion, please tell me if you want me to elaborate.
  • Neutral
  • Somewhat disagree
  • Strongly disagree
  • I don’t know / no opinion
‘The EU should make major investments in technologies and infrastructures that enhance data access and use, while giving individuals as well as public and private organisations full control over the data they generate.’ To what extent do you agree with this statement?
  • Strongly agree
  • Somewhat agree ← The EU will have to invest, but investment is no guarantee of success, nor does more investment guarantee better results. In some cases the investment will not be the key; the willingness to reach agreements, to implement the right processes, to listen to Wikimedia :-) and to many other stakeholders… will be.
  • Neutral
  • Somewhat disagree
  • Strongly disagree
  • I don’t know / no opinion
‘The development of common European data spaces should be supported by the EU in strategic industry sectors and domains of public interest (industry/manufacturing, Green Deal, mobility, health, finance, energy, agriculture, public administration, skills).’ To what extent do you agree with this statement?
  • Strongly agree
  • Somewhat agree ← I would choose "Strongly agree", but I find the question a little ambiguous. Data spaces should be "supported"… in what way? We need European data spaces, but why only in these sectors and not across the whole economy?
  • Neutral
  • Somewhat disagree
  • Strongly disagree
  • I don’t know / no opinion

Section 2.1 - Specific questions on future actions: Data governance

The use of data in the society and the economy raises a series of questions of legal, ethical, organisational and technical nature. Many angles need to be looked at in order to fully reap the benefits of the use of data without harm.

With the term ‘data governance’ we seek to refer to the set of legal, organisational and technical rules, tools and processes that determine the use of data by the public sector, business, individuals, civil society organisations, researchers.

This may translate into establishing mechanisms for data governance at European level which may support data-driven innovation in different ways:

  • At cross-sector level, it could identify the need for standards to facilitate data-sharing, including for the various actions to be taken in this regard (identification, authentication, access control). It could identify use cases in which cross-sector data re-use is supported by standardisation. It could provide technical guidance on technologies for lawful processing of data in accordance with data protection legislation, the need to protect commercially sensitive information as well as competition law.
  • At sector-specific level, data governance could be developed, building on existing structures and coordination mechanisms.
   
‘Data governance mechanisms are needed to capture the enormous potential of data in particular for cross-sector data use.’ To what extent do you agree with this statement?
  • Strongly agree ← Just my opinion.
  • Somewhat agree
  • Neutral
  • Somewhat disagree
  • Strongly disagree
  • I don’t know / no opinion
‘Public authorities should do more to make available a broader range of sensitive data for R&I purposes for the public interest, in full respect of data protection rights.’ To what extent do you agree with this statement?
  • Strongly agree
  • Somewhat agree ← Available to whom? What does "a broader range of sensitive data" mean? How can sensitive data be made available "in full respect of data protection rights"? Under what technical privacy rules? Differential privacy? k-anonymity? Or the original data but only available to a few authorized persons?
  • Neutral
  • Somewhat disagree
  • Strongly disagree
  • I don’t know / no opinion
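
To make one of the techniques named in the comment above more concrete, here is a minimal Python sketch of a k-anonymity check. It is only an illustration under stated assumptions: the function name `is_k_anonymous`, the field names and the records are all hypothetical and made up for this example, and a real release of sensitive data would need far more than this test (k-anonymity alone does not prevent every kind of re-identification or attribute disclosure).

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    that occurs in `records` is shared by at least `k` records."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return all(count >= k for count in groups.values())


# Hypothetical, made-up records (not real data).
records = [
    {"age_band": "30-39", "postcode_prefix": "28", "diagnosis": "A"},
    {"age_band": "30-39", "postcode_prefix": "28", "diagnosis": "B"},
    {"age_band": "40-49", "postcode_prefix": "08", "diagnosis": "A"},
]

# The third record is alone in its (age_band, postcode_prefix) group,
# so the dataset is 1-anonymous but not 2-anonymous.
print(is_k_anonymous(records, ["age_band", "postcode_prefix"], 2))
print(is_k_anonymous(records, ["age_band", "postcode_prefix"], 1))
```

Which fields count as quasi-identifiers, and which value of k is acceptable, are exactly the kind of choices the question leaves open.
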
Do you think that law and technology should enable citizens to make available their data for the public interest, without any direct reward?
  • Yes
  • No
  • I don't know / no opinion

This opens up obvious possibilities, but also a number of problems. For instance:

  • We have to make sure that people are aware of the practical consequences and risks of donating their data when they are offered the option to do so. Although "the consent to the processing of such data" can "be withdrawn", people should keep in mind that their data may never disappear completely, even in cases where the law requires otherwise.
  • In the context of a study (demographic, academic, public health…), if the sample has not been designed following strict statistical criteria, but only according to the people who have decided to donate their data, the results can be biased, misleading and invalid. These datasets should be distinguishable from the rest and everyone who uses these datasets should be aware of this limitation.
  • If donations are not supported by processes that ensure the data is reliable and up-to-date, the donations can be counterproductive because, again, their use can lead to wrong decisions and actions.
  • There are ethical dilemmas related to the value of personal data and to the social and institutional legitimization of these practices. Just to mention one, everyone's personal data (an asset that even the poorest people have) would lose value, including the data of those who don't donate it.

I don't think Wikimedia should take a categorical affirmative or negative position on this.

Do you think there are sufficient tools and mechanisms to “donate” your data?
  • Yes
  • No
  • I don't know / no opinion ← I don't think Wikimedia should take a categorical affirmative or negative position on this.
‘Such intermediaries [data marketplaces/brokers] are useful enablers of the data economy.’ To what extent do you agree with this statement?
  • Strongly agree
  • Somewhat agree
  • Neutral
  • Somewhat disagree ← This question is controversial. As a free knowledge movement, we would probably rather have data be freely accessible than bought and sold almost clandestinely by some companies, but we can't say that buying and selling data is bad by definition, and other answers with proper justifications would seem equally valid to me. By the way, I wouldn't say that data marketplaces are "novel"…
  • Strongly disagree
  • I don’t know / no opinion

Section 2.2 - Specific questions on future actions: identification of high-value datasets

The recently adopted Directive 2019/1024/EU (Open Data Directive) introduces the concept of high-value datasets (HVDs), defined as documents the re-use of which is associated with important benefits for society and the economy (e.g. job creation, new digital services, more efficient and evidence-based policy making). Under the directive, the Commission is required to adopt an implementing act setting out a list of specific high value datasets within the thematic categories listed in Annex I to the directive (geospatial; earth observation and environment; meteorological; statistics; companies and company ownership; mobility). The directive specifies that those datasets shall be made available for re-use free of charge, in machine-readable formats, provided via application programming interfaces (APIs) and, where relevant, as bulk download.

The answers to the questions below will help the Commission draw up an EU-wide list of specific high-value datasets.

"The establishment of a list of high-value datasets, to be made available free of charge, without restrictions and via APIs, is a good way to ensure that public sector data has a positive impact on the EU's economy and society." To what extent do you agree with this statement?
  • Strongly agree
  • Somewhat agree ← Only if, once again, these datasets are kept up to date and, in general, their quality is guaranteed.
  • Neutral
  • Somewhat disagree
  • Strongly disagree
  • I don’t know / no opinion
Apart from the potential to generate socio-economic benefits, please indicate the relevance of the following additional factors to be taken into account when selecting datasets for the future list of high value datasets
For each please indicate: Very relevant; Relevant; Neutral; Not relevant; Not relevant at all; I don't know / no opinion
  • The re-use of the dataset would increase if it was provided free of charge. Relevant / Very relevant
  • The dataset belongs to a thematic area in which there are few EU-level requirements for opening up data. Neutral / Relevant
  • The re-use of the dataset would increase if its availability under uniform conditions was ensured across the entire EU. Relevant / Very relevant
  • The re-use of the dataset would increase if it was available via an application programming interface (API). Relevant / Very relevant
  • If other factors: please specify
Please indicate the relevance of each of the other arrangements indicated below to improve the re-usability of specific high-value datasets.
For each please indicate: Very relevant; Relevant; Neutral; Not relevant; Not relevant at all; I don't know / no opinion
  • Licensing and other terms applicable to re-use. Relevant / Very relevant
  • Standardised formats of data and metadata. Very relevant
  • Possibility of user feedback. Relevant / Very relevant
  • Specific technical arrangements for dissemination. I don't know / no opinion, too ambiguous
  • If other arrangements, please specify:
EU programmes may provide funding to enhance the availability and re-use of high-value datasets across Europe. For each of the following activities, please indicate how relevant it is to support them.
For each please indicate: Very relevant; Relevant; Neutral; Not relevant; Not relevant at all; I don't know / no opinion
  • Improving the quality (e.g. machine-readability) and interoperability of the data/metadata. Relevant / Very relevant
  • Ensuring sustainable data provision via application programming interfaces (APIs). Relevant / Very relevant
  • Engaging with re-users (promoting the data, co-defining use-cases). Relevant / Very relevant
  • If other activities, please specify:
According to your experience and the expected potential of concrete datasets, indicate up to three specific datasets that should be listed in each of the thematic categories of high-value datasets, as referred to in Article 13(1) of the Open Data Directive
  • Geospatial:
  • Earth observation and environment:
  • Meteorological:
  • Statistics:
  • Companies and company ownership:
  • Mobility:

Unfortunately, I have to save this question for when I have some more time. --abián 14:50, 4 May 2020 (UTC)