
Safeguarding the data used to train artificial neural networks


Artificial neural networks, a form of artificial intelligence, can be trained efficiently through collaborative learning, where the training data comes from multiple private sources. Exchanging this data raises privacy concerns, however. CEA-List has come up with a secure method for the collaborative construction of deep neural networks.

Published on 5 October 2021

It takes vast amounts of data, sometimes from multiple sources, to train artificial neural networks. During both the learning (training) phase and the operating (inference) phase, the privacy of that training data, which can include sensitive material like patient medical records, is potentially at risk.

CEA-List researchers developed SPEED (Secure PrivatE and Efficient Deep learning), a privacy-by-construction learning method that could protect sensitive data during both phases. SPEED’s three pillars are:

Share as little data as possible. With SPEED, contributors exchange only encrypted labels, which keeps the underlying training data secure during distributed learning across multiple contributors.

Make the network impossible to reverse engineer. SPEED’s differential privacy mechanism limits the risk that users could reconstruct the original data by observing the trained network, at a negligible computational cost.

Shield the network from server-level threats. To keep the risk of exposing training data to a minimum, server-level threats have to be reduced; better still, the need for a trusted third party can be eliminated. Using homomorphic encryption (HE), the aggregation server processes only the encrypted labels “blind”, without ever “seeing” the underlying data (see the sketch after this list).
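Taken together, these pillars amount to a private label-aggregation protocol: each contributor encrypts a noised vote for a label, the server sums the votes under encryption, and only the aggregated, differentially private result is ever decrypted. The minimal sketch below illustrates that idea in Python using additive (Paillier) homomorphic encryption from the open-source python-paillier (phe) library. The class count, noise scale, and helper names are illustrative assumptions, not SPEED’s actual design, which relies on a different HE scheme and a carefully calibrated distributed noise mechanism.

```python
# Illustrative sketch only: private label aggregation in the spirit of SPEED,
# using additive (Paillier) homomorphic encryption from the `phe` library.
# NUM_CLASSES, the noise scale, and the helper names are assumptions.
import numpy as np
from phe import paillier

NUM_CLASSES = 3

# Key pair held on the contributors' side; the aggregation server only ever
# sees the public key and ciphertexts.
public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

def encrypted_noisy_vote(predicted_class: int, noise_scale: float = 0.5):
    """One contributor's one-hot label vote, with a share of Laplace noise
    added for differential privacy, encrypted component by component."""
    vote = [1.0 if c == predicted_class else 0.0 for c in range(NUM_CLASSES)]
    noisy = [v + np.random.laplace(0.0, noise_scale) for v in vote]
    return [public_key.encrypt(x) for x in noisy]

# Contributor side: each party sends only encrypted, noised label votes.
local_predictions = [0, 0, 2, 0, 1]  # toy predictions from five contributors
ballots = [encrypted_noisy_vote(p) for p in local_predictions]

# Server side: "blind" aggregation. Adding ciphertexts sums the votes per
# class without the server ever decrypting an individual ballot.
encrypted_tally = ballots[0]
for ballot in ballots[1:]:
    encrypted_tally = [t + b for t, b in zip(encrypted_tally, ballot)]

# Key-holder side: only the aggregated, noisy counts are decrypted, and the
# winning label is released to train the downstream model.
counts = [private_key.decrypt(c) for c in encrypted_tally]
print("aggregated label:", max(range(NUM_CLASSES), key=counts.__getitem__))
```

Because Paillier is additively homomorphic, the server can compute sums of ciphertexts but nothing else, which is all label aggregation requires; and because each contributor adds its own noise share before encrypting, no single party, including the server, ever holds the clean vote counts.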

This research was published in the journal Machine Learning and accepted for presentation at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD'21), a major machine learning event.
