Bias Mitigation Techniques - L0 Curriculum

Story

Tanaka (VPoE)

You have learned the types of bias and the fairness metrics. Next come the concrete mitigation techniques.

You

Can bias be removed completely?

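Tanaka (VPoE)

Driving it all the way to zero is difficult, but it can be held within an acceptable range. Mitigation techniques fall into three broad stages: pre-processing (the data stage), in-processing (the model stage), and post-processing (the output stage).

You

So the approach differs at each stage of the pipeline.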

The Three Stages of Bias Mitigation

Data collection → Pre-processing → Model training → Post-processing → Output
                        │                │                 │
                        ▼                ▼                 ▼
                  Pre-processing   In-processing    Post-processing
                      (Pre)             (In)            (Post)

1. Pre-processing Approach

This approach mitigates bias at the training-data stage, before the model sees the data.

Resampling

import pandas as pd


class BiasResampler:
    """Mitigate bias by resampling the data."""

    def oversample_minority(self, data, group_col, target_col):
        """Oversample minority groups up to the size of the largest group."""
        # Note: target_col is accepted but not used by this implementation.
        groups = data.groupby(group_col)
        max_size = max(len(g) for _, g in groups)

        balanced = []
        for name, group in groups:
            if len(group) < max_size:
                # Upsample the minority group by drawing rows with replacement
                upsampled = group.sample(
                    n=max_size, replace=True, random_state=42
                )
                balanced.append(upsampled)
            else:
                balanced.append(group)

        return pd.concat(balanced).reset_index(drop=True)

    def undersample_majority(self, data, group_col, target_col):
        """Undersample every group down to the size of the smallest group."""
        groups = data.groupby(group_col)
        min_size = min(len(g) for _, g in groups)

        balanced = []
        for name, group in groups:
            # Draw min_size rows without replacement from each group
            downsampled = group.sample(n=min_size, random_state=42)
            balanced.append(downsampled)

        return pd.concat(balanced).reset_index(drop=True)
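To make the effect on group sizes concrete, here is a minimal standard-library sketch of oversampling. The `oversample` helper, the record layout, and the sample data are assumptions made for illustration. Unlike `group.sample(..., replace=True)` above, this variant keeps every original row and only adds randomly drawn duplicates.

```python
import random
from collections import Counter, defaultdict


def oversample(records, group_key, seed=42):
    """Duplicate rows of smaller groups until every group matches the largest one."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for row in records:
        by_group[row[group_key]].append(row)

    max_size = max(len(rows) for rows in by_group.values())
    balanced = []
    for rows in by_group.values():
        balanced.extend(rows)  # keep every original row
        # top up with rows drawn with replacement from the same group
        balanced.extend(rng.choices(rows, k=max_size - len(rows)))
    return balanced


# Hypothetical data: group "A" has 6 rows, group "B" only 2.
records = [{"group": "A", "label": i % 2} for i in range(6)]
records += [{"group": "B", "label": 1}, {"group": "B", "label": 0}]

balanced = oversample(records, "group")
counts = Counter(row["group"] for row in balanced)
print(counts)
```

Both groups end up with the same number of rows, but the added rows for group "B" are exact duplicates, which is why oversampling raises the risk of overfitting to a handful of minority examples.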

Data Augmentation

class BiasAugmenter:
    """Mitigate bias through data augmentation.

    The helpers _is_underrepresented, _synonym_replacement and
    _back_translate are not defined in this lesson and must be supplied.
    """

    def augment_text_data(self, texts, labels, group_labels):
        """Augment text data for underrepresented groups."""
        augmented_texts = []
        augmented_labels = []

        for text, label, group in zip(texts, labels, group_labels):
            # Always keep the original example
            augmented_texts.append(text)
            augmented_labels.append(label)

            # Add extra variants only for underrepresented groups
            if self._is_underrepresented(group):
                # Synonym replacement
                aug_text = self._synonym_replacement(text)
                augmented_texts.append(aug_text)
                augmented_labels.append(label)

                # Back-translation
                aug_text = self._back_translate(text)
                augmented_texts.append(aug_text)
                augmented_labels.append(label)

        return augmented_texts, augmented_labels
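The class above relies on helpers that this lesson does not define. As a rough illustration of what two of them might look like, the sketch below uses standalone functions. The `SYNONYMS` table, the `ratio` cutoff, and the function names are hypothetical, and back-translation is left out because it needs an external translation service.

```python
import random
from collections import Counter

# Hypothetical synonym table; a real system would use a thesaurus or a language model.
SYNONYMS = {
    "quick": ["fast", "rapid"],
    "help": ["assist", "support"],
}


def underrepresented_groups(group_labels, ratio=0.5):
    """Return the groups whose count is below `ratio` times the largest group's count."""
    counts = Counter(group_labels)
    largest = max(counts.values())
    return {g for g, n in counts.items() if n < ratio * largest}


def synonym_replacement(text, rng):
    """Replace each word that has an entry in SYNONYMS with a randomly chosen synonym."""
    words = text.split()
    replaced = [rng.choice(SYNONYMS[w]) if w in SYNONYMS else w for w in words]
    return " ".join(replaced)


group_labels = ["A", "A", "A", "A", "B"]
minority = underrepresented_groups(group_labels)

rng = random.Random(0)
augmented = synonym_replacement("quick help needed", rng)
print(minority, augmented)
```

Words without an entry in the table pass through unchanged, so the label of the augmented text can reasonably be assumed to stay the same as the original's.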

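Feature Neutralization

from sklearn.linear_model import LinearRegression


def neutralize_features(data, protected_attribute, features):
    """Remove the linear effect of the protected attribute from each feature.

    Assumes `data` is a pandas DataFrame and the protected attribute is
    numerically encoded (for example 0/1).
    """
    neutralized = data.copy()
    for feature in features:
        # Regress the feature on the protected attribute ...
        model = LinearRegression()
        model.fit(
            data[[protected_attribute]],
            data[feature],
        )
        # ... and keep only the residual, the part the attribute cannot explain
        predicted = model.predict(data[[protected_attribute]])
        neutralized[feature] = data[feature] - predicted

    return neutralized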
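The arithmetic behind this residualization can be checked by hand. The sketch below reimplements the one-variable least-squares fit with the standard library only; the `residualize` and `covariance` helpers and the income figures are assumptions for illustration. It shows that the covariance between the feature and the protected attribute drops to zero after neutralization.

```python
from statistics import mean


def residualize(feature, protected):
    """Remove the linear effect of `protected` from `feature` via one-variable least squares."""
    mx, my = mean(protected), mean(feature)
    var_x = sum((x - mx) ** 2 for x in protected)
    cov_xy = sum((x - mx) * (y - my) for x, y in zip(protected, feature))
    slope = cov_xy / var_x
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(protected, feature)]


def covariance(a, b):
    """Population covariance of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


# Hypothetical data: group 1 has systematically higher income than group 0.
protected = [0, 0, 0, 1, 1, 1]
income = [30.0, 32.0, 34.0, 50.0, 52.0, 54.0]

neutral_income = residualize(income, protected)
print(covariance(income, protected))
print(covariance(neutral_income, protected))
```

Only the linear dependence is removed. A nonlinear relationship, or a proxy formed by several features jointly, can still leak the protected attribute.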

2. In-processing Approach

This approach mitigates bias during the model's training process.

Fairness-Constrained Training

class FairnessConstrainedModel:
    """Add a fairness constraint to the loss function.

    _cross_entropy is not defined in this lesson and must be supplied.
    """

    def __init__(self, base_model, fairness_weight=0.5):
        self.model = base_model
        # 0 = task loss only, 1 = fairness penalty only
        self.fairness_weight = fairness_weight

    def loss_function(self, predictions, targets, groups):
        """Task loss + fairness penalty."""
        # Ordinary task loss
        task_loss = self._cross_entropy(predictions, targets)

        # Fairness penalty (degree of Demographic Parity violation):
        # collect the predictions per group
        group_rates = {}
        for pred, group in zip(predictions, groups):
            if group not in group_rates:
                group_rates[group] = []
            group_rates[group].append(pred)

        # Gap between the highest and lowest mean prediction across groups
        rates = [sum(r) / len(r) for r in group_rates.values()]
        fairness_penalty = max(rates) - min(rates)

        # Total loss: weighted sum of the two terms
        total_loss = (
            (1 - self.fairness_weight) * task_loss
            + self.fairness_weight * fairness_penalty
        )
        return total_loss
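To see how `fairness_weight` trades the two terms off, the following standalone sketch computes the same weighted loss on a handful of scores. It assumes binary targets and predictions given as positive-class probabilities. The `binary_cross_entropy` helper stands in for the undefined `_cross_entropy`, and the data is made up.

```python
import math


def binary_cross_entropy(predictions, targets, eps=1e-12):
    """Mean binary cross-entropy for predicted probabilities."""
    total = 0.0
    for p, y in zip(predictions, targets):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(predictions)


def demographic_parity_gap(predictions, groups):
    """Largest difference between any two groups' mean predicted score."""
    by_group = {}
    for p, g in zip(predictions, groups):
        by_group.setdefault(g, []).append(p)
    rates = [sum(v) / len(v) for v in by_group.values()]
    return max(rates) - min(rates)


def total_loss(predictions, targets, groups, fairness_weight=0.5):
    """Weighted sum of task loss and fairness penalty."""
    task = binary_cross_entropy(predictions, targets)
    gap = demographic_parity_gap(predictions, groups)
    return (1 - fairness_weight) * task + fairness_weight * gap


# Hypothetical scores: group "A" receives higher scores than group "B".
predictions = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
targets = [1, 1, 0, 1, 0, 0]
groups = ["A", "A", "A", "B", "B", "B"]

gap = demographic_parity_gap(predictions, groups)
print(round(gap, 3))
print(round(total_loss(predictions, targets, groups, fairness_weight=0.0), 3))
print(round(total_loss(predictions, targets, groups, fairness_weight=0.5), 3))
```

With `fairness_weight=0.0` the result is the plain task loss, and with `1.0` it is the parity gap alone. In a real training loop the penalty would need to be expressed with differentiable tensor operations so that gradients can flow through it.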

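Adversarial Debiasing

Generator (main model)
  ├─ Maximizes accuracy on the task prediction
  └─ Minimizes the accuracy with which the protected attribute can be predicted (adversarial)

Adversary (adversarial network)
  └─ Estimates the protected attribute from the main model's output

Training process:
  1. The Generator makes a task prediction
  2. The Adversary estimates the protected attribute from the Generator's output
  3. The Generator learns to maintain task accuracy while
     preventing the Adversary from estimating the protected attribute

→ Result: a predictive model whose output depends far less on the protected attribute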
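The two competing goals above can be written as a pair of objectives. The sketch below covers only the loss bookkeeping, not a trainable network. The function names, the `adversary_weight` parameter, and the sample probabilities are assumptions for illustration.

```python
import math


def bce(probs, labels, eps=1e-12):
    """Mean binary cross-entropy."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(probs)


def adversary_objective(attr_probs, protected):
    """The adversary minimizes its error when predicting the protected attribute."""
    return bce(attr_probs, protected)


def generator_objective(task_probs, targets, attr_probs, protected, adversary_weight=1.0):
    """The generator minimizes task loss while maximizing the adversary's loss."""
    return bce(task_probs, targets) - adversary_weight * bce(attr_probs, protected)


targets = [1, 0, 1, 0]
protected = [1, 1, 0, 0]
task_probs = [0.9, 0.2, 0.8, 0.1]

# An adversary that recovers the protected attribute well vs. one stuck at chance level.
leaky = generator_objective(task_probs, targets, [0.9, 0.9, 0.1, 0.1], protected)
hidden = generator_objective(task_probs, targets, [0.5, 0.5, 0.5, 0.5], protected)
print(leaky > hidden)
```

For the same task accuracy, the generator's objective is lower when the adversary can do no better than chance, so minimizing it pushes the generator toward outputs that reveal less about the protected attribute.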

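Prompt-Based Debiasing (for LLMs)

def debiased_prompt(user_query: str) -> str:
    """Wrap a user query in instructions intended to reduce biased LLM output."""
    return f"""
    Please answer the following question.

    Guidelines for your answer:
    - Avoid stereotypes based on attributes such as gender, age, or race
    - Do not use wording that favors or discriminates against any particular group
    - Give an answer that is fair to users from diverse backgrounds
    - When citing statistics, state the source explicitly

    Question: {user_query}
    """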

3. Post-processing Approach

This approach mitigates bias by adjusting the model's output, leaving the model itself unchanged.

Threshold Adjustment

class ThresholdOptimizer:
    """Adjust the decision threshold per group to achieve fairness.

    _calculate_metric is not defined in this lesson and must be supplied;
    the search below treats lower metric values as better.
    """

    def optimize(self, scores, groups, actuals, metric="equal_opportunity"):
        """Search for thresholds that satisfy the fairness metric."""
        best_thresholds = {}
        unique_groups = set(groups)

        for group in unique_groups:
            # Select the scores and ground-truth labels of this group
            group_mask = [g == group for g in groups]
            group_scores = [s for s, m in zip(scores, group_mask) if m]
            group_actuals = [a for a, m in zip(actuals, group_mask) if m]

            best_threshold = 0.5
            best_metric_value = float("inf")

            # Grid search over thresholds 0.01, 0.02, ..., 0.99
            for threshold in [i / 100 for i in range(1, 100)]:
                preds = [1 if s >= threshold else 0 for s in group_scores]
                metric_value = self._calculate_metric(
                    preds, group_actuals, metric
                )
                if metric_value < best_metric_value:
                    best_metric_value = metric_value
                    best_threshold = threshold

            best_thresholds[group] = best_threshold

        return best_thresholds
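Because each group is searched independently, the metric has to encode a target shared by all groups. Otherwise the per-group optima need not agree with each other. One plausible reading, sketched below as an assumption, is that the metric measures the distance between a group's true positive rate (TPR) and a common `target_tpr`, which is the quantity equal opportunity compares. The helper names and data are hypothetical.

```python
def true_positive_rate(preds, actuals):
    """Share of actual positives that were predicted positive."""
    positives = [p for p, a in zip(preds, actuals) if a == 1]
    return sum(positives) / len(positives) if positives else 0.0


def thresholds_for_target_tpr(scores, groups, actuals, target_tpr=0.8):
    """Pick, per group, the threshold whose TPR is closest to a shared target."""
    thresholds = {}
    for group in sorted(set(groups)):
        g_scores = [s for s, g in zip(scores, groups) if g == group]
        g_actuals = [a for a, g in zip(actuals, groups) if g == group]

        best_threshold, best_gap = 0.5, float("inf")
        for threshold in [i / 100 for i in range(1, 100)]:
            preds = [1 if s >= threshold else 0 for s in g_scores]
            gap = abs(true_positive_rate(preds, g_actuals) - target_tpr)
            # "<=" breaks ties toward the higher threshold
            if gap <= best_gap:
                best_gap, best_threshold = gap, threshold
        thresholds[group] = best_threshold
    return thresholds


# Hypothetical scores: the model scores group "B" lower across the board.
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.5, 0.4, 0.2]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
actuals = [1, 1, 1, 0, 1, 1, 1, 0]

thresholds = thresholds_for_target_tpr(scores, groups, actuals, target_tpr=1.0)
print(thresholds)
```

With this data the search settles on 0.6 for group "A" and 0.4 for group "B": the lower threshold for "B" compensates for its systematically lower scores so that both groups reach the same TPR.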

Ensuring Output Diversity (for Recommendations)

class DiversityReranker:
    """Reranker that ensures diversity in recommendation results."""

    def rerank(
        self, recommendations: list, diversity_weight: float = 0.3
    ) -> list:
        """Rerank items, balancing relevance against category diversity."""
        reranked = []
        remaining = list(recommendations)
        categories_shown = set()

        # Greedily pick the best remaining item until none are left
        while remaining and len(reranked) < len(recommendations):
            # -inf so that items with very low or negative scores are still comparable
            best_score = float("-inf")
            best_idx = 0

            for idx, item in enumerate(remaining):
                relevance = item["score"]
                # Bonus for categories that have not been shown yet
                diversity_bonus = (
                    diversity_weight
                    if item["category"] not in categories_shown
                    else 0
                )
                combined = relevance + diversity_bonus

                if combined > best_score:
                    best_score = combined
                    best_idx = idx

            selected = remaining.pop(best_idx)
            reranked.append(selected)
            categories_shown.add(selected["category"])

        return reranked
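A short walk-through helps show what the bonus does to the ordering. The sketch below restates the same greedy loop as a compact standalone function and runs it on a made-up catalogue. The `rerank` function and the item titles are assumptions for illustration.

```python
def rerank(recommendations, diversity_weight=0.3):
    """Greedy reranking: relevance plus a bonus for categories not shown yet."""
    remaining = list(recommendations)
    reranked, categories_shown = [], set()
    while remaining:
        best = max(
            remaining,
            key=lambda item: item["score"]
            + (diversity_weight if item["category"] not in categories_shown else 0),
        )
        remaining.remove(best)
        reranked.append(best)
        categories_shown.add(best["category"])
    return reranked


# Hypothetical catalogue in which "action" titles dominate the top scores.
items = [
    {"title": "a1", "category": "action", "score": 0.90},
    {"title": "a2", "category": "action", "score": 0.85},
    {"title": "a3", "category": "action", "score": 0.80},
    {"title": "d1", "category": "drama", "score": 0.70},
    {"title": "c1", "category": "comedy", "score": 0.60},
]

order = [item["title"] for item in rerank(items)]
baseline = [item["title"] for item in rerank(items, diversity_weight=0.0)]
print(order)
print(baseline)
```

With the default weight, the drama and comedy titles move ahead of the second and third action titles. With a weight of 0 the list stays in pure relevance order. A larger weight buys more variety at the cost of pushing higher-scoring items down.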

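Technique Comparison and Selection Guide

Technique	Stage	Advantages	Disadvantages	Best suited for
Resampling	Pre-processing	Simple, model-agnostic	Information loss (undersampling) / overfitting (oversampling)	Data imbalance is the main cause
Feature neutralization	Pre-processing	Removes the influence of proxy variables	Also discards useful information	Proxy variables can be identified
Fairness-constrained training	In-processing	Balances accuracy and fairness	Complex to implement	Custom models
Prompt debiasing	In-processing	Easy to apply to LLMs	Limited effect	LLM-based systems
Threshold adjustment	Post-processing	No model changes required	Requires group definitions	Binary classification
Diversity reranking	Post-processing	Easy to implement	Lowers relevance	Recommendations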

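Summary

Stage	Main techniques	Key point
Pre-processing	Resampling, data augmentation, feature neutralization	Correct skew in the data
In-processing	Fairness-constrained training, adversarial debiasing	The model learns fairness
Post-processing	Threshold adjustment, diversity reranking	Adjust outputs to make them fairer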

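Checklist

I can explain the three-stage approach: pre-processing, in-processing, and post-processing
I understand how resampling and feature neutralization are implemented
I can apply prompt-based debiasing to an LLM
I understand how diversity reranking works

Estimated time: 30 minutes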