AAAI Conference on Artificial Intelligence

Paper link

0x00 Abstract

Though the FL paradigm has received significant interest recently from the research community, the problem of selecting the clients relevant to the central server's learning objective is under-explored. We refer to these problems as Federated Relevant Client Selection (FRCS). Because the server does not have explicit control over the nature of the data possessed by each client, the problem of selecting relevant clients is significantly more complex in FL settings.

The FRCS problems that need to be resolved:

  • selecting clients with relevant data
  • detecting clients that possess data relevant to a particular target label
  • rectifying corrupted data samples of individual clients

We follow a principled approach to address the above FRCS problems and develop a new federated learning method using the Shapley value concept from cooperative game theory. Towards this end, we propose a cooperative game involving the gradients shared by the clients. Using this game, we compute the Shapley values of the clients and then present the Shapley value-based Federated Averaging (S-FedAvg) algorithm, which empowers the server to select relevant clients with high probability.

Keywords: relevant data, Shapley value, federated learning

0x01 Introduction

The FL setting assumes a synchronous update of the central model, and the following steps proceed in each round of the learning process (McMahan et al. 2017).

Note that only a fraction of clients is selected in each round as adding more clients would lead to diminishing returns beyond a certain point.

Though the emphasis was initially on mobile-centric FL applications involving thousands of clients, there has recently been significant interest in enterprise-driven FL applications that involve only a few tens of clients.

Motivation:

We apply the standard FedAvg algorithm to two cases:

(a) where all clients possess relevant data

(b) where some clients possess irrelevant data.

A model is trained on MNIST to learn the even digits.

NOTE: odd labels are flipped to even labels!

To simulate irrelevance, we use the open-set label noise strategy (Wang et al. 2018), wherein we randomly flip each odd label of the 4 irrelevant clients to one of the even labels.

Contributions of Our Work:

  • the Shapley value is used as a measure of the relevance of a client's data
  • FedAvg is modified into a Shapley value-based variant (S-FedAvg)
  • two FRCS problems are solved:
    • selecting clients with high-quality data
    • detecting and rectifying clients' anomalous data

0x03 Problem Statement

The presence of irrelevant data degrades both the accuracy and the stability of the model!

0x04 Proposed Solution

Game-theoretic preliminaries

The Shapley value is defined as follows:

$$SV_i(v)=\frac{1}{|N|!}\sum_{\pi \in \Pi}\left[v(C_{\pi}(i)\cup\{i\})-v(C_{\pi}(i))\right]$$

where $v(\cdot)$ is the characteristic function (contribution function); $\Pi$ is the set of all permutations (arrival orders) of the player set $N$; and $C_{\pi}(i)=\{j\in\pi:\pi(j)<\pi(i)\}$, where $\pi(j)$ denotes the position of $j$ in the permutation $\pi$.

Intuitively, $C_{\pi}(i)$ is the coalition already formed before $i$ arrives.

Since every player eventually joins (we sum over all permutations), the formula can be rewritten as:

$$SV_i(v)=\sum_{C \subseteq N\setminus\{i\}}\frac{|C|!\,(|N|-|C|-1)!}{|N|!}\left[v(C\cup\{i\})-v(C)\right]$$

Notes on the formula:

Here $C$ denotes a coalition rather than a permutation.

Each coalition corresponds to many arrival orders, so this form enumerates the permutations coalition by coalition.

Suppose $N=\{1,2,3,4,5,6\}$ and take $i=1$, $C=\{2,3\}$; the following permutations arise:

$$\{2,3,1,4,5,6\},\ \{2,3,1,4,6,5\},\ \dots,\ \{3,2,1,4,5,6\},\ \{3,2,1,4,6,5\},\ \dots$$

Since $i$ arrives third in every one of these permutations, the marginal contribution $v(C\cup\{i\})-v(C)$ is the same for all of them.

The number of such permutations is $|C|!\,(|N|-|C|-1)!$: the number of ways to order $\{2,3\}$ before $i$, times the number of ways to order the remaining players after $i$.
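To make the definition concrete, here is a minimal Python sketch (not from the paper) that computes exact Shapley values by enumerating all arrival orders; the characteristic function passed in is a toy stand-in.

```python
import itertools
from math import factorial

def shapley_values(players, v):
    """Exact Shapley values by enumerating all |N|! arrival orders.

    players: list of player ids
    v: characteristic function mapping a frozenset of players to a number
    """
    n = len(players)
    sv = {p: 0.0 for p in players}
    for perm in itertools.permutations(players):
        coalition = frozenset()
        for p in perm:
            # marginal contribution of p over the players that arrived before it
            sv[p] += v(coalition | {p}) - v(coalition)
            coalition = coalition | {p}
    return {p: total / factorial(n) for p, total in sv.items()}

# Toy characteristic function: the value of a coalition is its squared size,
# so by symmetry every player should get v(N)/|N| = 9/3 = 3.
print(shapley_values([1, 2, 3], lambda c: len(c) ** 2))
```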

Computing the relevance scores:

  • Client $k$ sends its parameter update $\delta^{t+1}_k=\theta^{t+1}_k-\theta^t$ to the server;

  • Define the cooperative game $(\delta^{t+1},v)$, where $\delta^{t+1}=\{\delta^{t+1}_s\}_{s\in S^t}$ and $v(\cdot)$ is the characteristic function;

    $$\theta^{t+1}_{X}=\theta^t+\frac{1}{|X|}\sum_{s \in X}\delta^{t+1}_s$$
    $$v(X,D_V)=\mathcal{P}(f_{\theta^{t+1}_X},D_V)$$

    $\mathcal{P}(\cdot,\cdot)$ denotes the performance of the central model with parameters $\theta^{t+1}_X$ on the validation set $D_V$.

  • $\varphi=(\varphi_1,\varphi_2,...,\varphi_K)$ is the relevance vector, where $\varphi_k$ is the relevance of $Client_k$. ($\varphi$ is kept only on the server and is initialized to $\varphi_k=\frac{1}{K}$.)

    We posit that the higher a client's relevance value, the more likely it is to be relevant, and thus the greater its contribution to the objective of the central model.

  • At each aggregation round, obtain the Shapley values $sv(s),\forall s\in S^t$ from the cooperative game and update the relevance vector:

    $$\varphi_s=\alpha\cdot\varphi_s+\beta\cdot sv(s);\quad \forall s \in S^t$$

  • Compute the relative relevance $P_k$ using $softmax(\cdot)$ and select the set of participating clients $S^t$ according to $P_k$. (A sketch of these steps follows.)
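A minimal sketch of this per-round bookkeeping, assuming NumPy parameter vectors; `evaluate` stands in for $\mathcal{P}(\cdot,D_V)$, and the $\alpha$, $\beta$ defaults and example Shapley values are placeholders, not the paper's settings:

```python
import numpy as np

def characteristic(theta, deltas, coalition, evaluate):
    """v(X, D_V): build theta^{t+1}_X by averaging the coalition's updates
    on top of theta^t, then score it on D_V via `evaluate`."""
    coalition = list(coalition)
    if not coalition:
        return evaluate(theta)
    return evaluate(theta + sum(deltas[s] for s in coalition) / len(coalition))

def update_relevance(phi, shapley, alpha=0.5, beta=0.5):
    """phi_s <- alpha * phi_s + beta * sv(s) for this round's participants."""
    for s, sv in shapley.items():
        phi[s] = alpha * phi[s] + beta * sv
    return phi

def sample_clients(phi, m, rng):
    """Draw m clients without replacement with probabilities softmax(phi)."""
    p = np.exp(phi - phi.max())
    p /= p.sum()
    return rng.choice(len(phi), size=m, replace=False, p=p)

rng = np.random.default_rng(0)
K = 10
phi = np.full(K, 1.0 / K)        # initialization: phi_k = 1/K
phi = update_relevance(phi, {0: 0.08, 3: -0.02, 7: 0.05})  # example values
print(sample_clients(phi, m=5, rng=rng))
```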

The full algorithm:

In the experiments, $T=100$ and $m=5$; $\varphi_k$ can be computed via a multi-order Markov-style recursion (each round's value depends on the previous one). A sketch of one round is given below.
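Putting the pieces together, one possible shape of a single S-FedAvg round, reusing `shapley_values`, `characteristic`, `update_relevance`, and `sample_clients` from the sketches above; `local_update` is a hypothetical stand-in for a client's local training:

```python
import numpy as np

def s_fedavg_round(theta, phi, m, local_update, evaluate, rng):
    """One round: select S^t, collect updates, score them with Shapley
    values on D_V, update phi, and take a FedAvg-style step."""
    chosen = sample_clients(phi, m, rng)                          # pick S^t
    deltas = {k: local_update(theta, k) - theta for k in chosen}
    sv = shapley_values(list(deltas),
                        lambda c: characteristic(theta, deltas, c, evaluate))
    phi = update_relevance(phi, sv)                               # phi update
    theta = theta + sum(deltas.values()) / len(deltas)            # aggregation
    return theta, phi

# Training loop with the paper's T = 100 rounds and m = 5 clients per round:
# for t in range(T):
#     theta, phi = s_fedavg_round(theta, phi, 5, local_update, evaluate, rng)
```

Note that with $m=5$ there are only $2^5=32$ distinct coalitions per round, which is why computing the exact Shapley values stays tractable at this enterprise scale.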

0x05 Experiments

1. Setup

  • Extreme 1-class-non-iid

As is typical in the federated learning setting, we assume that the data is distributed non-iid across the clients. We follow the extreme 1-class-non-iid approach of (Zhao et al. 2018b) when distributing the data to clients.

  • $|D_V|=1000$, $|D_{test}|=4000$

  • To verify the interference caused by irrelevant data, the odd labels are changed according to the following rule (see the sketch after this list):

    $k=\{1\to 0,\ 5\to 2,\ 3\to 4,\ 9\to 6,\ 7\to 8\}$

  • Hyper-parameter Values:
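As referenced above, a minimal sketch of this flipping rule, assuming the labels are a NumPy integer array:

```python
import numpy as np

# Fixed odd->even flip rule from the setup: 1->0, 5->2, 3->4, 9->6, 7->8.
FLIP = {1: 0, 5: 2, 3: 4, 9: 6, 7: 8}

def make_irrelevant(labels):
    """Return a copy of `labels` with every odd digit replaced by its
    designated even digit, simulating an irrelevant client's data."""
    flipped = labels.copy()
    for odd, even in FLIP.items():
        flipped[labels == odd] = even
    return flipped

print(make_irrelevant(np.array([0, 1, 2, 3, 9])))   # -> [0 0 2 4 6]
```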

2. S-FedAvg: Selecting Relevant Clients

3. Class-specific Best Client Selection

To detect the best client for a class, the validation set $D_V$ is set to a dataset of the target class.

For example, to detect the client with the best data for digit $2$, set $D_V$ to a dataset containing only $2$s:
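A minimal sketch of building such a class-specific $D_V$, assuming NumPy arrays `x` and `y` (names hypothetical):

```python
import numpy as np

def class_specific_validation(x, y, target_label, size=1000, seed=0):
    """Restrict D_V to a single target class so the Shapley values reward
    exactly the clients whose updates help on that class."""
    rng = np.random.default_rng(seed)
    idx = np.flatnonzero(y == target_label)
    idx = rng.choice(idx, size=min(size, idx.size), replace=False)
    return x[idx], y[idx]

# Hypothetical usage with MNIST-shaped arrays:
# x_val, y_val = class_specific_validation(x_all, y_all, target_label=2)
```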

4. Data Label Standardization

In this experiment, $Client_3$'s label $2$ is corrupted to $4$ ($2\to 4$).
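One plausible rectification step, sketched under the assumption that a trusted model (e.g. one trained with the best client for the target class) is available to re-predict labels; this is an illustration, not necessarily the paper's exact procedure:

```python
import numpy as np

def rectify_labels(x, y, predict_proba, threshold=0.9):
    """Relabel samples on which the trusted model confidently disagrees
    with the stored label; everything else is left untouched.

    predict_proba: hypothetical callable returning (n, num_classes) probs.
    """
    probs = predict_proba(x)
    pred, conf = probs.argmax(axis=1), probs.max(axis=1)
    suspect = (pred != y) & (conf >= threshold)   # likely corrupted samples
    y_fixed = y.copy()
    y_fixed[suspect] = pred[suspect]
    return y_fixed, suspect
```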
