FEDERATED MACHINE LEARNING — A METHOD OF TRAINING AI MODELS

Swapnil Saurav
17 min read · Nov 26, 2023


Federated machine learning is an emerging approach in the field of machine learning that allows for training models on decentralized data sources without the need to share or centralize the data. Let’s explore this in today’s edition.

Traditional Model

In recent years, we have witnessed exponential growth in data generation and an increasing need for sophisticated machine-learning models to extract valuable insights. Artificial intelligence and machine learning offer the ability to learn from experience and refine models without being explicitly programmed, and these innovations have come into common use across a wide range of fields and applications.

For example, in the automotive industry, analyzing data collected from a vehicle fleet provides insights into customer needs, vehicle activity, and driving environments. A typical strategy is to send compressed sensor data from every vehicle to a central server, where all of the analysis is performed.

The entire dataset is available for training the model, and the training process occurs on the centralized server. The data is usually collected from multiple sources, but it is combined and stored in one place before model training takes place.

Traditional machine learning models require the data to be centralized in a single location or server. However, this progress has been impeded by legitimate concerns regarding privacy, security, and data ownership.

Challenges with Traditional Model

  1. Data Privacy: Centralizing the data raises concerns about privacy as all the data is stored and accessed in one location. It may involve sensitive information, and there is a risk of unauthorized access or data breaches.
  2. Data Transfer: In some cases, transferring the entire dataset to a central server can be time-consuming, resource-intensive, and costly, particularly if the data is large or distributed across various locations.
  3. Legal Compliance: Centralizing data may also raise legal and compliance issues, especially if the data represents individuals or entities from multiple jurisdictions with different data protection regulations.

Distributed Machine Learning — one possible solution

One solution is to use a distributed machine learning model.

Distributed machine learning algorithms are designed to handle the computational demands of complex algorithms on large-scale datasets, and they are generally more efficient and scalable than fully centralized training.

Here, the models are trained with the same methodology as centralized machine learning, except that training happens separately on multiple participants.

During training in a distributed algorithm, the participants independently train their models and send weight updates to the central server. The central server receives the updates from the participants and averages them to produce its output. After a certain number of communication rounds, convergence is tested on the central cloud server.

Federated Machine Learning

Enter federated machine learning — a revolutionary approach combining collaborative learning and data privacy, ushering in a new era of machine learning applications.

Federated machine learning is a system that enables multiple entities to collaboratively train a machine learning model while keeping their data decentralized and secure. Instead of sending all the data to a centralized server, the data remains on local devices or servers, and only the model parameters are exchanged during the learning process.

The key practical difference between distributed machine learning and federated learning is autonomy: in federated learning, each participant initializes and runs its training independently, without any central orchestration of the data itself. Individual participants conduct the training both collaboratively and independently. In particular, a number of local epochs is declared in the learning parameters, and each participant trains on its own data for that many epochs. After the specified number of epochs, the local update is computed, and the participants send their updates to the cloud server. The cloud server receives the update from each participant, averages them, and aggregates the next global model. Based on this global model, the participants execute the training process for the next communication round. The process repeats until the desired convergence level is achieved or the communication rounds are exhausted.
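The round structure just described can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than any particular framework's API: three simulated clients, a toy one-parameter linear model fitting y ≈ 2x, and hand-picked learning parameters.

```python
import random

def local_train(w, data, epochs=5, lr=0.01):
    """Run the declared number of local epochs of SGD on one client's
    private data, for the toy model y ~ w * x."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x   # gradient of the squared error w.r.t. w
            w -= lr * grad
    return w

# Hypothetical setup: three clients, each holding private noisy samples of y = 2x.
random.seed(0)
clients = [[(x, 2 * x + random.gauss(0, 0.1))
            for x in (random.uniform(0, 1) for _ in range(20))]
           for _ in range(3)]

w_global = 0.0
for round_ in range(10):                                   # communication rounds
    local_ws = [local_train(w_global, data) for data in clients]  # independent local training
    w_global = sum(local_ws) / len(local_ws)               # server-side averaging

print(w_global)  # settles near the true slope 2
```

Note that only the scalar weight crosses the network in this sketch; the `(x, y)` samples never leave their client.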

Federated machine learning overcomes these limitations by distributing the model training process across multiple devices or edge nodes while keeping the data local. In this approach:

  1. Data Privacy: Data remains on the local devices and is not shared with the central server or other participants, ensuring privacy and confidentiality.
  2. Data Ownership: Users or entities retain ownership of their data and have control over its usage, reducing concerns about relinquishing control of sensitive information.
  3. Data Storage and Transfer: The need to transfer large amounts of data to a centralized location is minimized or eliminated. Only model updates or gradients are shared between devices and the central server, reducing the burden of data transfer.
  4. Legal Compliance: Federated learning can be designed in a privacy-preserving manner that aligns with data protection regulations and complies with legal requirements without centralized data storage.

Advantages of Federated Machine Learning

The concept of federated machine learning addresses the challenges of data privacy and security by minimizing data exposure. As the data remains on the local devices, the risk of data breaches or unauthorized access is significantly reduced. Moreover, federated learning eliminates the need for data aggregation, thereby reducing the vulnerabilities associated with centralized data repositories.

Another significant advantage of federated machine learning is its ability to leverage distributed data sources. In conventional machine learning, a central server typically trains models on large, homogeneous datasets. However, in federated learning, each participating entity retains ownership of their data, enabling a more diverse and representative training sample. This decentralized approach enhances the robustness and generalizability of the trained models, enabling them to perform better in real-world scenarios.

Moreover, federated machine learning promotes scalability and efficiency. By distributing the computational load across multiple devices, federated learning enables parallel processing, leading to faster convergence and reduced training time. This aspect is particularly valuable when dealing with large-scale datasets, where centralization often becomes a bottleneck.

Federated machine learning is highly relevant in scenarios where data cannot be easily shared due to regulatory constraints, contractual agreements, or intellectual property concerns. Industries such as healthcare, finance, and telecommunications stand to benefit immensely from federated learning, as it allows different entities (hospitals, banks, etc.) to collaborate without compromising sensitive information while still extracting meaningful insights.

Techniques used by FML

Technique 1: Secure Aggregation


This technique ensures that the model updates sent by participants are combined without revealing individual contributions, thus preserving data privacy. It allows for the aggregation of locally trained models or gradients without exposing any sensitive information or raw data to the centralized server.

The secure aggregation process typically involves the following steps:

  1. Local model training: Each device or client trains a local model using its own data. This can be done using various machine-learning algorithms and techniques.
  2. Encryption: After training the local model, each device encrypts its model parameters or gradients using encryption techniques like homomorphic encryption or secure multi-party computation. This ensures that the model updates cannot be directly observed or used to reconstruct the original data.
  3. Aggregation: The encrypted model updates are then sent to the central server for aggregation. The server can perform operations on the encrypted data without decrypting it, allowing the aggregation of the model updates without exposing the raw data or model parameters.
  4. Decryption: Once the aggregation is complete, the central server can decrypt the aggregated model updates using secret keys or decryption algorithms. This allows the server to obtain the final aggregated model without ever accessing the raw data or individual model updates.
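One classic way to get the "combine without revealing individual contributions" property, used in pairwise-masking secure aggregation protocols, is to have each pair of clients agree on a shared random mask that one adds and the other subtracts, so every mask cancels in the server's sum. The sketch below is a minimal single-process simulation of that idea; the shared seed, mask range, and update values are illustrative assumptions, and a real protocol would derive the pairwise masks from a key exchange.

```python
import random

def add_pairwise_masks(updates, shared_seed=42):
    """Mask each client's update so its individual value is hidden from the
    server, while the masks cancel exactly in the sum over all clients."""
    rng = random.Random(shared_seed)   # stands in for pairwise agreed secrets
    masked = list(updates)
    n = len(updates)
    for i in range(n):
        for j in range(i + 1, n):
            m = rng.uniform(-100, 100)  # secret shared by clients i and j
            masked[i] += m              # client i adds the mask
            masked[j] -= m              # client j subtracts it
    return masked

updates = [1.5, -0.25, 3.0]             # hypothetical per-client model updates
masked = add_pairwise_masks(updates)
# The server sees only the masked values, yet the aggregate is exact:
print(sum(masked))                      # equals sum(updates) up to float error
```

The server learns the total (which is all it needs for averaging) but each individual `masked[i]` looks like noise.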

Secure aggregation provides several benefits in federated learning:

  1. Privacy protection: The encryption of model updates ensures that the individual data or model information is not directly exposed to the central server. This protects the privacy of the users and their sensitive data.
  2. Security against attacks: By encrypting the model updates, secure aggregation protects against attacks that aim to extract sensitive information or manipulate the learning process.
  3. Data confidentiality: The encryption techniques used in secure aggregation ensure that the central server cannot access the raw data or model parameters, maintaining the confidentiality of the information.
  4. Compliance with regulations: Secure aggregation enables federated learning to comply with data protection regulations, such as the General Data Protection Regulation (GDPR), by minimizing the exposure of personal data to third parties.

However, secure aggregation also has some limitations. The encryption and decryption processes can introduce additional computational overhead and communication costs, which can impact the overall efficiency and scalability of federated learning systems. Therefore, it is important to carefully choose encryption algorithms and optimize the secure aggregation techniques to strike a balance between privacy and computational efficiency.

Technique 2: Differential Privacy

By adding noise to the model updates, differential privacy techniques aim to prevent the leakage of sensitive information about individual data points.

This noise makes it difficult for an attacker to determine whether a specific user’s data has been used in the training process or to extract sensitive information from the model. The level of noise added is carefully calibrated to balance privacy and utility. It ensures that the individual contributions of each user are hidden, while still allowing the model to learn from the combined knowledge of all users.

The differential privacy technique is applied to the local model updates performed by each user device. Instead of sending the raw model updates to the central server, each user device first applies random noise to the update. This noise is generated using a privacy amplification factor, which determines the overall strength of the privacy protection.

The privacy amplification factor is a parameter that determines the amount of noise added to the local model updates. A higher value of this parameter leads to stronger privacy guarantees but may also reduce the utility of the model. A lower value, on the other hand, may improve utility but could compromise privacy.

To ensure that privacy is maintained, a trusted third party typically generates the privacy amplification factor. This third party is responsible for calibrating the noise added to the local model updates based on a pre-defined privacy budget. This budget determines how much privacy protection is provided to each user’s data. The noise added to the updates is carefully selected to ensure that the privacy budget is not exceeded.
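A minimal sketch of this calibration, assuming a scalar update, a clipping bound of 1.0 to cap the update's sensitivity, and Laplace noise (one common differential-privacy mechanism; Gaussian noise is another). The epsilon value, seed, and update are illustrative assumptions.

```python
import math
import random

def dp_perturb(update, rng, clip=1.0, epsilon=0.5):
    """Clip a scalar update to bound its sensitivity, then add Laplace noise
    calibrated to the privacy budget epsilon
    (smaller epsilon = stronger privacy = more noise)."""
    clipped = max(-clip, min(clip, update))
    scale = 2 * clip / epsilon                 # Laplace scale for sensitivity 2*clip
    u = rng.random() - 0.5                     # inverse-CDF sampling of Laplace(0, scale)
    noise = -scale * math.copysign(math.log(1 - 2 * abs(u)), u)
    return clipped + noise

rng = random.Random(1)
true_update = 0.3
noisy = [dp_perturb(true_update, rng) for _ in range(10_000)]
mean = sum(noisy) / len(noisy)
print(mean)   # each value is heavily perturbed, but the aggregate stays near 0.3
```

This is the privacy/utility trade-off in miniature: any single `noisy` value reveals almost nothing about `true_update`, while the average over many simulated clients still recovers the signal.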

When the centralized server receives the perturbed model updates from each user device, it aggregates the updates to create a global model. The aggregated updates are combined using secure multi-party computation techniques, which allow the server to perform the necessary calculations without knowing the individual contributions.

By applying differential privacy to federated learning, user privacy is protected while still allowing the central server to learn from the collective knowledge of all users. This technique provides a balance between privacy and utility, allowing users to participate in the training process without compromising their sensitive data.

Technique 3: Homomorphic Encryption

Homomorphic encryption is a cryptographic technique that allows computations on encrypted data without decrypting it. This means that data can remain protected and private throughout the computation process.

Combining homomorphic encryption and federated learning can provide a powerful privacy-preserving mechanism for training machine learning models on sensitive data. Here’s how it works:

  1. Data Encryption: Each decentralized device (such as a smartphone or IoT device) encrypts its data using a homomorphic encryption scheme. This ensures that the data remains confidential and cannot be accessed by the central server or any other party without the decryption key.
  2. Model Training: The central server initializes a machine learning model and sends it to the decentralized devices. The devices use the model to perform local computations on their encrypted data. These computations include operations like addition and multiplication, which are possible with homomorphic encryption.
  3. Aggregation of Encrypted Results: After performing the local computations, the decentralized devices send the encrypted results back to the central server. The server then aggregates these encrypted results to obtain a combined result without decrypting any data.
  4. Decryption and Model Update: The central server decrypts the aggregated result using the decryption key. It then uses this result to update the machine-learning model. The updated model is then sent back to the decentralized devices for subsequent iterations of the training process.
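The additive property at the heart of steps 2–4 can be demonstrated with a toy Paillier cryptosystem, a well-known additively homomorphic scheme: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. The tiny primes and fixed randomness below are for illustration only and provide no real security; production systems use keys of 2048 bits or more.

```python
# Toy Paillier cryptosystem (tiny primes, illustration only -- NOT secure).
import math

p, q = 61, 53                  # toy primes
n = p * q                      # public modulus
n2 = n * n
g = n + 1                      # standard generator choice
lam = math.lcm(p - 1, q - 1)   # part of the secret key

def L(x):
    return (x - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)   # precomputed decryption constant

def encrypt(m, r):
    """Encrypt message m with randomness r (r must be coprime to n)."""
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

# Two clients encrypt their (integer-encoded) updates; the server multiplies
# the ciphertexts, which ADDS the plaintexts -- without ever decrypting them.
c1 = encrypt(12, r=17)
c2 = encrypt(30, r=23)
aggregated = (c1 * c2) % n2
print(decrypt(aggregated))            # -> 42, i.e. 12 + 30
```

Only the holder of the secret key (`lam`, `mu`) can run `decrypt`; the server performing the multiplication learns nothing about the individual values 12 and 30.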

The key advantages of using homomorphic encryption in federated learning are:

  1. Privacy-preserving: Since the data remains encrypted throughout the process, the central server does not have access to the raw data. This helps protect sensitive information and ensures that user privacy is maintained.
  2. Security: Homomorphic encryption provides a strong level of security by ensuring that the data remains confidential and protected from unauthorized access. Even if the encrypted data is intercepted, it would be of no use without the decryption key.
  3. Decentralization: Federated learning allows training models on decentralized devices and the use of homomorphic encryption extends this decentralized nature to the encryption process as well. This promotes data ownership and control by the device owners while still enabling collaborative model training.

However, there are some challenges and limitations to consider:

  1. Efficiency: Homomorphic encryption is computationally expensive and can introduce significant overhead in terms of time and resources required for encryption, decryption, and computation.
  2. Limited Operations: While homomorphic encryption supports basic operations such as addition and multiplication, more complex operations required for machine learning tasks may not be fully supported. This can limit the types of computations that can be directly performed on the encrypted data.
  3. Complexity: Implementing and managing a homomorphic encryption scheme requires considerable expertise in cryptography and adds complexity to the overall federated learning workflow.

In summary, homomorphic encryption in federated learning offers a promising solution for privacy-preserving machine learning on sensitive data. It can help protect user privacy and enable collaborative training without sharing raw data. However, it also comes with challenges and limitations that must be carefully addressed for practical implementation.

Implementation Approaches for FML

Now let's talk about the different approaches used for federated learning.

Approach 1: Vertical Federated Learning

Vertical federated learning combines different feature sets, held by different parties about the same set of users or samples, improving the model’s accuracy. It is an approach in which data from different sources, typically different organizations, is used to train a model collaboratively without sharing the underlying data. This approach is particularly useful when the data is held in a decentralized manner and cannot be easily shared due to privacy concerns or legal restrictions.

In vertical federated learning, the participating organizations maintain ownership and control of their own data while collaborating to train a shared model. This is achieved through a secure and distributed learning framework that ensures privacy and data protection. The basic idea is to train a model using data from multiple sources, without sharing the actual data, by using cryptographic techniques and secure protocols.

The process of vertical federated learning typically involves the following steps:

  1. Data partitioning: The participating organizations determine how to split their data into different subsets based on common features or attributes. Each organization retains control over its own data and shares only the relevant partition with the collaborating parties.
  2. Model initialization: A common machine learning model is initialized using a predetermined architecture and parameters. This model will be trained using the vertically federated data.
  3. Local model training: Each participating organization trains the initialized model on its own partition of data. This process is similar to traditional model training, where local data is used to update the model parameters iteratively.
  4. Secure aggregation: After local training, the model updates from each organization are securely aggregated and combined. This ensures that the model retains privacy and does not reveal any sensitive information about the individual sources.
  5. Global model update: The aggregated model updates are used to update the shared model. This step ensures that the model learns from the collective information of all participating organizations while maintaining their privacy.
  6. Model evaluation: The updated model is evaluated to measure its performance and effectiveness. This evaluation can help identify areas for further improvement or refinement.

The process of vertical federated learning can be repeated iteratively to improve the shared model over time. It allows organizations to benefit from the collective knowledge and data without compromising privacy or violating data protection regulations. This approach has applications in various domains, including healthcare, finance, and telecommunications, where sensitive data is decentralized but collaboration is required to train effective machine learning models.

Approach 2: Horizontal Federated Learning

Horizontal Federated Learning is an approach to privacy-preserving machine learning where multiple data owners collaborate to train a shared machine learning model without sharing their raw data with each other or with a central server.

In this approach, each data owner trains a local model using their own data. The local models are then aggregated to create a global model that captures the knowledge from all the local models. The global model is then used for making predictions.

The process of training the local models involves multiple iterations and communication between the data owners. In each iteration, the data owners perform the following steps:

  1. Local Model Training: Each data owner trains their local model using their own data. This can be done using any machine learning algorithm or framework, such as neural networks or decision trees. The local models aim to capture the patterns and knowledge present in the local data.
  2. Model Aggregation: The local models are then aggregated to create a global model. This can be done by averaging the weights or parameters of the local models or using more sophisticated techniques like federated averaging. The aggregation process allows the knowledge from all the local models to be combined into a single model without exposing the raw data.
  3. Model Update: The global model is then updated by incorporating the aggregated knowledge from the local models. This updated global model is then used as the starting point for the next iteration.
  4. Communication: The data owners communicate the updates to the global model without revealing their raw data. This can be done securely using encryption techniques or by using privacy-preserving algorithms like secure multi-party computation or homomorphic encryption.
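Step 2 above can be sketched as weighted federated averaging, where each local model contributes in proportion to the number of samples it was trained on (the weighting used in the federated averaging algorithm). The model vectors and sample counts below are illustrative assumptions.

```python
def federated_average(models, sample_counts):
    """Aggregate per-client weight vectors into a global model, weighting
    each client by the number of samples it trained on."""
    total = sum(sample_counts)
    dims = len(models[0])
    return [sum(m[i] * c for m, c in zip(models, sample_counts)) / total
            for i in range(dims)]

# Hypothetical local models from three data owners with unequal data sizes
local_models  = [[0.2, 1.0], [0.4, 0.8], [0.3, 0.9]]
sample_counts = [100, 300, 600]
avg = federated_average(local_models, sample_counts)
print(avg)
```

Weighting by sample count keeps a client with very little data from pulling the global model as hard as a client that trained on far more examples.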

Horizontal Federated Learning offers several advantages:

  1. Privacy Preservation: Since the raw data is not shared with any other party, data owners have full control over their data and maintain their privacy. This is especially important in industries such as healthcare and finance where the data is sensitive and subject to strict privacy regulations.
  2. Distributed Computation: The training process is distributed across multiple data owners, allowing them to leverage their computational resources and train models faster. This also reduces the dependence on a central server, making the approach more scalable.
  3. Data Diversity: By collaborating with multiple data owners, the global model can benefit from a diverse range of data sources. This can improve the generalization and accuracy of the model, especially in scenarios where the local data is limited or biased.

However, Horizontal Federated Learning also has some challenges:

  1. Communication Overhead: The training process requires frequent communication between the data owners, which can introduce latency and increase the network traffic. Efficient communication protocols and compression techniques need to be employed to minimize the overhead.
  2. Heterogeneous Data: Data owners may have different data distributions, formats, or quality, which can pose challenges in aggregating the local models. Techniques like data preprocessing, feature selection, or model adaptation need to be employed to handle the heterogeneity.
  3. Security Risks: The communication and aggregation processes can be prone to various security risks, such as data leakage or model poisoning attacks. Robust security mechanisms need to be implemented to mitigate these risks.

Overall, Horizontal Federated Learning is a promising approach for privacy-preserving machine learning that allows multiple data owners to collaborate and benefit from each other’s data while maintaining their privacy. Ongoing research and advancements in techniques like encryption, privacy-preserving algorithms, and model aggregation are further improving the effectiveness and practicality of this approach.

USE CASE: Developing FML Model for Hospitals

Objective:

The objective of this use case is to develop a federated machine learning system that enables hospitals from different regions to collaboratively train a disease diagnosis model without sharing raw patient data. The trained model will then be able to accurately diagnose diseases based on patient symptoms and medical records.

Architecture:

  1. Data Preparation: Each hospital collects patient data, including symptoms, medical records, and test results. This data is anonymized and stored securely within each hospital’s premises.
  2. Federated Learning Server: A central server is set up to coordinate the federated learning process. It hosts the federated learning algorithm, manages model aggregation, and ensures privacy and security of the entire process.
  3. Federated Learning Clients: Each hospital deploys a federated learning client, which runs on their local infrastructure. The client receives the model from the server, performs local model training on its own data, and sends the updated model parameters back to the server.
  4. Federated Learning Process:
  • a. Initialization: The central server initializes a global model and shares it with all federated learning clients.
  • b. Model Training: Each federated learning client trains the model on their local data using privacy-preserving techniques like differential privacy or secure multi-party computation.
  • c. Model Updates: After training, each client sends the updated model parameters to the central server.
  • d. Aggregation: The central server aggregates the updated model parameters using techniques like federated averaging.
  • e. Iteration: Steps b-d are repeated for multiple iterations to improve the model’s performance.

5. Model Deployment: After the federated learning process is completed, the trained disease diagnosis model is deployed on each hospital’s infrastructure. This allows hospitals to use the model for diagnosing diseases in new patients in a privacy-preserving manner.

Benefits

  1. Privacy Preservation: The federated learning approach ensures that raw patient data remains decentralized and does not leave the jurisdiction of each hospital. This protects patient privacy and complies with data protection regulations.
  2. Collaboration and Knowledge Sharing: Hospitals from different regions can collaborate and collectively build a powerful disease diagnosis model by sharing knowledge without sharing data.
  3. Improved Diagnosis Accuracy: By training on diverse patient data from multiple hospitals, the federated model can capture a broader range of symptoms and patterns, resulting in improved accuracy in disease diagnosis.
  4. Real-time Updates: As new data becomes available, hospitals can continuously update the federated model to reflect the latest medical knowledge and advancements.
  5. Cost Efficiency: By leveraging the shared model, hospitals can reduce individual model development costs and benefit from collective intelligence, making healthcare more affordable and accessible.

Conclusion

Federated Machine Learning for disease diagnosis is a promising approach that enables hospitals to collaboratively build accurate models without compromising patient privacy. This use case demonstrates the potential of FML in revolutionizing healthcare by combining the power of machine learning with privacy-preserving techniques.

Challenge 1:

However, like any emerging technology, federated machine learning also presents challenges that need to be addressed. One significant challenge is the issue of maintaining model accuracy when training on distributed and potentially diverse datasets. Techniques such as model averaging and secure aggregation have been proposed to tackle this challenge, but there is still ongoing research to improve the performance and convergence of federated learning algorithms.

Challenge 2

Another challenge is ensuring the security and integrity of the federated learning process. As the models and parameters are exchanged between devices, there is a risk of malicious attacks or model poisoning. Robust security protocols and techniques, such as encryption and differential privacy, must be implemented to safeguard the federated learning process against adversarial behavior.

Summary

To conclude, federated machine learning represents a significant step forward in the field of machine learning, offering a solution to privacy concerns while harnessing the power of collective intelligence. Its potential applications are vast and varied, ranging from healthcare and finance to smart cities and IoT devices.

Through federated machine learning, we can achieve advancements in personalized medicine, fraud detection, predictive maintenance, and many other domains, all while respecting user privacy and data ownership. It is incumbent upon researchers, policymakers, and industry leaders to collaborate and further develop this technology, ensuring that it benefits society as a whole.

Thank you for your attention. Let us embrace this revolutionary approach, federated machine learning, and unlock its vast potential for a brighter and more secure future.

Written by Swapnil Saurav

Swapnil Saurav has more than 18 years of experience in IT industry with focus on Supply Chain Analytics and IT Service Management.
