Scalability and Performance

Story

CTO Sato

What would you do if the number of users grew tenfold?

You

Upgrade the server's specs...?

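CTO Sato

That's scaling up. It is certainly the quickest fix, but a single machine's specs have a ceiling. What about when traffic grows 100-fold, or 1,000-fold?

You

Spread the load across multiple servers... so, scaling out.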

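CTO Sato

Right. Scalability and performance are factors that shape the very foundation of an architecture. This time, let's work through the design techniques systematically.

Scaling in Two Directions

Vertical Scaling (Scale-Up)

This approach adds resources (CPU, memory, disk) to an existing server.

graph LR
    A["Server<br/>CPU: 4 core<br/>RAM: 8 GB<br/>Disk: 100 GB<br/><i>Before</i>"] -->|"Scale up"| B["Server<br/>CPU: 32 core<br/>RAM: 128 GB<br/>Disk: 2 TB SSD<br/><i>After</i>"]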

    classDef beforeStyle fill:#e8a838,stroke:#b07c1e,color:#fff
    classDef afterStyle fill:#5cb85c,stroke:#3d8b3d,color:#fff

    class A beforeStyle
    class B afterStyle

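Advantages	Disadvantages
Simple to implement	Hardware imposes an upper limit
No application changes required	Poor cost efficiency (prices climb steeply at the high end)
Data consistency is easy to maintain	Remains a single point of failure

Horizontal Scaling (Scale-Out)

This approach increases the number of servers and distributes the load across them.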

graph LR
    LB["Load Balancer"] --> S1["Server 1"]
    LB --> S2["Server 2"]
    LB --> S3["Server 3"]

    classDef lbStyle fill:#e8a838,stroke:#b07c1e,color:#fff
    classDef serverStyle fill:#4a90d9,stroke:#2c5f8a,color:#fff

    class LB lbStyle
    class S1,S2,S3 serverStyle

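Advantages	Disadvantages
No hard hardware ceiling; capacity grows by adding servers	Requires changes to the application design
Provides redundancy	Managing data consistency becomes complex
Good cost efficiency	Session management becomes a challenge

Load Balancing Strategies

The load balancer is the linchpin of horizontal scaling.

Main Algorithms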

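// Conceptual implementations of load-balancing algorithms.
// Server is a minimal stand-in type; Request is left abstract here.
interface Server {
  id: string;
  weight: number;            // relative capacity, used by weighted round robin
  activeConnections: number; // maintained by the caller as requests start and finish
}

interface LoadBalancer {
  selectServer(request: Request): Server;
}

// Round robin: hand out servers in order
class RoundRobinBalancer implements LoadBalancer {
  private currentIndex = 0;

  constructor(private servers: Server[]) {}

  selectServer(request: Request): Server {
    const server = this.servers[this.currentIndex];
    this.currentIndex = (this.currentIndex + 1) % this.servers.length;
    return server;
  }
}

// Weighted round robin: distribute in proportion to server capacity
class WeightedRoundRobinBalancer implements LoadBalancer {
  private slots: Server[];
  private currentIndex = 0;

  // A server with weight 3 gets 3 slots, so it receives 3x the requests of a
  // weight-1 server (weights are assumed to be positive integers)
  constructor(servers: Server[]) {
    this.slots = servers.flatMap((s) => new Array<Server>(s.weight).fill(s));
  }

  selectServer(request: Request): Server {
    const server = this.slots[this.currentIndex];
    this.currentIndex = (this.currentIndex + 1) % this.slots.length;
    return server;
  }
}

// Least connections: pick the server with the fewest active connections
class LeastConnectionsBalancer implements LoadBalancer {
  constructor(private servers: Server[]) {}

  selectServer(request: Request): Server {
    return this.servers.reduce((min, server) =>
      server.activeConnections < min.activeConnections ? server : min
    );
  }
}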

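Algorithm Comparison

Algorithm	Characteristics	Best suited for
Round robin	Distributes evenly	Identically specced servers with uniform processing times
Weighted round robin	Accounts for capacity differences	Server pools with mixed specs
Least connections	Accounts for current load	Workloads with variable processing times
IP hash	Same IP always goes to the same server	When session affinity is required
Latency-based	Selects by response time	Multi-region deployments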
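
The table lists IP hash, but the code above does not show it. The following is a minimal sketch, assuming a hypothetical `Backend` type and a simple FNV-1a-style string hash (neither comes from the original). As long as the server list does not change, the same client IP maps to the same server:

```typescript
// Hypothetical minimal server type for illustration.
interface Backend {
  id: string;
}

// FNV-1a-style 32-bit hash over UTF-16 code units (an illustrative choice).
function hashString(key: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

class IpHashBalancer {
  constructor(private readonly backends: Backend[]) {
    if (backends.length === 0) throw new Error("at least one backend is required");
  }

  // The same IP hashes to the same index, so the client keeps hitting one server.
  selectServer(clientIp: string): Backend {
    const index = hashString(clientIp) % this.backends.length;
    return this.backends[index];
  }
}

const ipBalancer = new IpHashBalancer([{ id: "s1" }, { id: "s2" }, { id: "s3" }]);
const firstPick = ipBalancer.selectServer("203.0.113.7");
const secondPick = ipBalancer.selectServer("203.0.113.7");
console.log(firstPick.id === secondPick.id); // true: same IP, same server
```

Note that adding or removing a server changes the modulus, so most clients are remapped and lose their affinity. That is one reason session state is often moved to a shared store instead.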

Database Scaling

Scaling out application servers is relatively easy, but scaling the database is more complex.

Read Replicas

graph TD
    AppW["App Server<br/>(writes)"] -->|"Write"| Primary["Primary DB"]
    Primary -->|"Replication"| R1["Replica DB 1"]
    Primary -->|"Replication"| R2["Replica DB 2"]
    Primary -->|"Replication"| R3["Replica DB 3"]
    AppR["App Server<br/>(reads)"] -->|"Read"| R1
    AppR -->|"Read"| R2
    AppR -->|"Read"| R3

    classDef appStyle fill:#67b7dc,stroke:#3a8ab5,color:#fff
    classDef primaryStyle fill:#e8a838,stroke:#b07c1e,color:#fff
    classDef replicaStyle fill:#5cb85c,stroke:#3d8b3d,color:#fff

    class AppW,AppR appStyle
    class Primary primaryStyle
    class R1,R2,R3 replicaStyle

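// Splitting traffic between the primary and a read replica
// (DatabaseConnection is assumed to expose query<T>(sql, params): Promise<T[]>)
class OrderRepository {
  constructor(
    private primaryDb: DatabaseConnection,   // for writes
    private replicaDb: DatabaseConnection,   // for reads
  ) {}

  // Writes go to the primary
  async createOrder(order: Order): Promise<void> {
    await this.primaryDb.query(
      'INSERT INTO orders (id, user_id, total) VALUES ($1, $2, $3)',
      [order.id, order.userId, order.total]
    );
  }

  // Reads go to the replica (when replication lag is acceptable)
  async findRecentOrders(userId: string): Promise<Order[]> {
    return this.replicaDb.query<Order>(
      'SELECT * FROM orders WHERE user_id = $1 ORDER BY created_at DESC',
      [userId]
    );
  }

  // Reads that need strong consistency go to the primary.
  // FOR UPDATE only holds its row lock inside a transaction, so call this within one.
  async findOrderForPayment(orderId: string): Promise<Order> {
    const rows = await this.primaryDb.query<Order>(
      'SELECT * FROM orders WHERE id = $1 FOR UPDATE',
      [orderId]
    );
    if (rows.length === 0) throw new Error(`Order ${orderId} not found`);
    return rows[0];
  }
}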
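
Because of replication lag, a user may not see an order they just created if the follow-up read goes to a replica. One common mitigation is to send that user's reads to the primary for a short window after their own write. The sketch below is one possible shape for this; `ReadRouter`, `pinWindowMs`, and the injected clock are illustrative names, not part of the original code.

```typescript
type Target = "primary" | "replica";

class ReadRouter {
  // Last write time per user, in milliseconds.
  private lastWriteAt = new Map<string, number>();

  constructor(
    private readonly pinWindowMs: number,          // assumed upper bound on replication lag
    private readonly now: () => number = Date.now, // injectable clock for testing
  ) {}

  // Call this whenever the user writes through the primary.
  recordWrite(userId: string): void {
    this.lastWriteAt.set(userId, this.now());
  }

  // Reads go to the primary while the user's last write may not have replicated yet.
  targetForRead(userId: string): Target {
    const last = this.lastWriteAt.get(userId);
    if (last !== undefined && this.now() - last < this.pinWindowMs) {
      return "primary";
    }
    return "replica";
  }
}

// Usage with a fake clock so the behaviour is deterministic.
let fakeTime = 0;
const router = new ReadRouter(2000, () => fakeTime);
router.recordWrite("user-1");
console.log(router.targetForRead("user-1")); // "primary" (just wrote)
fakeTime = 5000;
console.log(router.targetForRead("user-1")); // "replica" (window has passed)
console.log(router.targetForRead("user-2")); // "replica" (never wrote)
```

With several application servers, the last-write timestamps would need to live somewhere shared (a cookie or a cache, for example); the in-memory map here only keeps the sketch short.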

Sharding (Horizontal Partitioning)

Data is distributed across multiple databases.

graph LR
    Key["Sharding key: user_id"]
    S0["Shard 0<br/>user_id % 3 == 0<br/><br/>User 0<br/>User 3<br/>User 6"]
    S1["Shard 1<br/>user_id % 3 == 1<br/><br/>User 1<br/>User 4<br/>User 7"]
    S2["Shard 2<br/>user_id % 3 == 2<br/><br/>User 2<br/>User 5<br/>User 8"]

    Key ~~~ S0 & S1 & S2

    classDef keyStyle fill:#1e293b,stroke:#475569,color:#f8fafc
    classDef shardStyle fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#1e40af

    class Key keyStyle
    class S0,S1,S2 shardStyle

// Conceptual shard router
class ShardRouter {
  private shards: DatabaseConnection[];

  constructor(shards: DatabaseConnection[]) {
    this.shards = shards;
  }

  // Hash-based sharding
  getShardForUser(userId: string): DatabaseConnection {
    const hash = this.hashFunction(userId);
    const shardIndex = hash % this.shards.length;
    return this.shards[shardIndex];
  }

  // Range-based sharding
  getShardByDateRange(date: Date): DatabaseConnection {
    const year = date.getFullYear();
    if (year <= 2024) return this.shards[0];     // historical data
    if (year === 2025) return this.shards[1];     // current data
    return this.shards[2];                         // future data
  }

  // FNV-1a-style 32-bit hash over UTF-16 code units. Taking it modulo the shard
  // count is simple, but changing the shard count remaps most keys; consistent
  // hashing is the usual remedy.
  private hashFunction(key: string): number {
    let hash = 0x811c9dc5;
    for (let i = 0; i < key.length; i++) {
      hash ^= key.charCodeAt(i);
      hash = Math.imul(hash, 0x01000193);
    }
    return hash >>> 0;
  }
}
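
The router's comment mentions consistent hashing. As a rough sketch of the idea, assuming shards identified by plain strings, 100 virtual nodes per shard, and an FNV-1a-style hash (all illustrative choices), a hash ring could look like this. Its key property is that adding a shard moves keys only onto the new shard, never between existing ones:

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units (an illustrative choice).
function hash32(key: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

class HashRing {
  // Ring positions sorted by hash; each shard appears once per virtual node.
  private ring: { position: number; shard: string }[] = [];

  constructor(shards: string[], private readonly virtualNodes = 100) {
    for (const shard of shards) this.addShard(shard);
  }

  addShard(shard: string): void {
    for (let i = 0; i < this.virtualNodes; i++) {
      this.ring.push({ position: hash32(`${shard}#${i}`), shard });
    }
    this.ring.sort((a, b) => a.position - b.position);
  }

  // Walk clockwise from the key's position to the first virtual node.
  getShard(key: string): string {
    if (this.ring.length === 0) throw new Error("ring is empty");
    const h = hash32(key);
    // Binary search for the first ring position >= h.
    let lo = 0;
    let hi = this.ring.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.ring[mid].position < h) lo = mid + 1;
      else hi = mid;
    }
    return this.ring[lo % this.ring.length].shard; // wrap around past the end
  }
}

const keys = Array.from({ length: 1000 }, (_, i) => `user-${i}`);
const ring = new HashRing(["shard-0", "shard-1", "shard-2"]);
const before = keys.map((k) => ring.getShard(k));
ring.addShard("shard-3");
const after = keys.map((k) => ring.getShard(k));
const moved = keys.filter((_, i) => before[i] !== after[i]).length;
console.log(`${moved} of ${keys.length} keys moved, all of them to shard-3`);
```

With modulo hashing, going from 3 to 4 shards would remap most keys. How evenly the ring spreads keys depends on the hash function and the number of virtual nodes, so both would need tuning in a real system.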

Partitioning (Vertical Split)

Tables are split by functional area and placed in separate databases.

graph LR
    A["User DB<br/>users<br/>profiles<br/>addresses"]
    B["Order DB<br/>orders<br/>order_items<br/>payments"]
    C["Product DB<br/>products<br/>categories<br/>reviews"]

    classDef dbStyle fill:#4a90d9,stroke:#2c5f8a,color:#fff
    class A,B,C dbStyle

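Caching Strategies

Caching is one of the most effective ways to improve performance.

Cache Layers

graph TD
    A["CDN (Edge Cache)<br/>Static assets, images, API responses<br/>← Closest to the user"]
    B["Application Cache (Redis/Memcached)<br/>Sessions, computed results, temporary data<br/>← Application layer"]
    C["Database Cache (Query Cache)<br/>Query results, materialized views<br/>← Database layer"]
    D["Database (Source of Truth)<br/>← The authoritative data source"]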

    A --> B --> C --> D

    classDef edgeStyle fill:#5cb85c,stroke:#3d8b3d,color:#fff
    classDef appStyle fill:#67b7dc,stroke:#3a8ab5,color:#fff
    classDef dbCacheStyle fill:#e8a838,stroke:#b07c1e,color:#fff
    classDef dbStyle fill:#d9534f,stroke:#b52b27,color:#fff

    class A edgeStyle
    class B appStyle
    class C dbCacheStyle
    class D dbStyle

Cache Patterns

// Cache-Aside pattern
class ProductService {
  constructor(
    private cache: CacheClient,
    private db: ProductRepository,
  ) {}

  async getProduct(id: string): Promise<Product> {
    // 1. Check the cache
    const cached = await this.cache.get(`product:${id}`);
    if (cached) {
      return JSON.parse(cached) as Product;
    }

    // 2. Cache miss → fetch from the DB
    const product = await this.db.findById(id);
    if (!product) throw new NotFoundError(`Product ${id} not found`);

    // 3. Store in the cache (TTL: 5 minutes)
    await this.cache.set(`product:${id}`, JSON.stringify(product), 300);

    return product;
  }

  async updateProduct(id: string, data: UpdateProductDTO): Promise<Product> {
    // 1. Update the DB
    const updated = await this.db.update(id, data);

    // 2. Invalidate the cache
    await this.cache.delete(`product:${id}`);

    return updated;
  }
}

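// Write-Through pattern (application-level approximation)
// UserProfileRepository is a hypothetical repository type.
class UserProfileService {
  constructor(
    private cache: CacheClient,
    private db: UserProfileRepository,
  ) {}

  async updateProfile(userId: string, profile: Profile): Promise<void> {
    // 1. Write to the DB
    await this.db.updateProfile(userId, profile);

    // 2. Update the cache as part of the same operation so reads see the new value.
    //    The two writes are not atomic: if this step fails, the cache stays stale
    //    until the TTL expires.
    await this.cache.set(
      `profile:${userId}`,
      JSON.stringify(profile),
      3600  // TTL: 1 hour
    );
  }
}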

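Comparing Cache Strategies

Pattern	Read performance	Write performance	Consistency	Best suited for
Cache-Aside	High	Unchanged	Eventual	Read-heavy data
Write-Through	High	Lower	Strong	Data where consistency matters
Write-Behind	High	High	Eventual	Write-heavy data
Read-Through	High	Unchanged	Eventual	Simple read caching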
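
The table includes Write-Behind, which the code above does not cover. A minimal sketch, assuming a hypothetical `BatchStore` interface standing in for the database: writes update memory immediately and are persisted later in a batch.

```typescript
// Hypothetical backing store; a real one would be a database client.
interface BatchStore<V> {
  saveAll(entries: Map<string, V>): Promise<void>;
}

class WriteBehindCache<V> {
  private cache = new Map<string, V>();
  private dirty = new Map<string, V>(); // written to the cache but not yet to the store

  constructor(private readonly store: BatchStore<V>) {}

  // Writes return as soon as the in-memory state is updated.
  set(key: string, value: V): void {
    this.cache.set(key, value);
    this.dirty.set(key, value); // a later write to the same key replaces the earlier one
  }

  get(key: string): V | undefined {
    return this.cache.get(key);
  }

  // Called periodically (for example from a timer) to persist pending writes in one batch.
  async flush(): Promise<number> {
    if (this.dirty.size === 0) return 0;
    const batch = this.dirty;
    this.dirty = new Map();
    try {
      await this.store.saveAll(batch);
    } catch (err) {
      // Put failed entries back unless a newer write has replaced them.
      batch.forEach((v, k) => {
        if (!this.dirty.has(k)) this.dirty.set(k, v);
      });
      throw err;
    }
    return batch.size;
  }
}

// In-memory stand-in for the database.
const persisted = new Map<string, number>();
const memoryStore: BatchStore<number> = {
  async saveAll(entries) {
    entries.forEach((v, k) => persisted.set(k, v));
  },
};

const counters = new WriteBehindCache<number>(memoryStore);
counters.set("views:42", 1);
counters.set("views:42", 2);
console.log(persisted.size); // 0: nothing has reached the store yet
counters.flush().then((n) => console.log(n, persisted.get("views:42"))); // 1 2
```

Two writes collapse into one store operation, which is where the write performance comes from. The cost is that pending writes are lost if the process dies before `flush()` runs, which is why the table rates consistency as eventual.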

Performance Benchmarks

Key Performance Metrics

interface PerformanceMetrics {
  // Latency (response time)
  latency: {
    p50: number;   // 50th percentile (median)
    p95: number;   // 95th percentile
    p99: number;   // 99th percentile
    max: number;   // maximum
  };

  // Throughput
  throughput: {
    rps: number;             // requests per second
    concurrentUsers: number; // number of concurrently connected users
  };

  // Resource utilization
  resources: {
    cpuUtilization: number;    // CPU utilization (%)
    memoryUtilization: number; // memory utilization (%)
    diskIOPS: number;          // disk I/O operations per second
    networkBandwidth: number;  // network bandwidth in use
  };

  // Error rate
  errorRate: number;  // share of requests that fail (%)
}
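
To make the latency fields concrete, here is a small sketch of how p50, p95, and p99 could be derived from raw samples. It uses the nearest-rank definition of a percentile, which is one of several definitions in use; the function names are illustrative.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(sortedSamples: number[], p: number): number {
  if (sortedSamples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const rank = Math.ceil((p * sortedSamples.length) / 100);
  return sortedSamples[rank - 1];
}

function summarizeLatency(samplesMs: number[]) {
  const sorted = [...samplesMs].sort((a, b) => a - b);
  return {
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
    max: sorted[sorted.length - 1],
  };
}

// 100 synthetic samples: 1 ms, 2 ms, ..., 100 ms
const samples = Array.from({ length: 100 }, (_, i) => i + 1);
console.log(summarizeLatency(samples)); // { p50: 50, p95: 95, p99: 99, max: 100 }
```

Averages hide slow outliers, so targets are usually expressed in percentiles: a p99 of 500 ms means 1 request in 100 is slower than that.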

Example Performance Targets

Endpoint	P50	P95	P99	Target RPS
Product list API	50ms	200ms	500ms	10,000
Product search API	100ms	300ms	800ms	5,000
Order creation API	200ms	500ms	1,000ms	1,000
User authentication	100ms	200ms	400ms	3,000

Capacity Planning

Calculation Method

// Example capacity-planning calculation
interface CapacityPlan {
  // Current metrics
  currentMetrics: {
    dailyActiveUsers: number;    // DAU
    peakRPS: number;             // RPS at peak
    avgResponseTime: number;     // average response time
    dataGrowthPerMonth: number;  // monthly data growth (GB)
  };

  // Growth forecast
  growthFactor: number;  // growth multiplier after one year

  // Required-resource calculation
  calculateRequiredResources(): ResourcePlan;
}

// Worked example
const plan = {
  currentMetrics: {
    dailyActiveUsers: 100_000,
    peakRPS: 5_000,
    avgResponseTime: 150,  // ms
    dataGrowthPerMonth: 50, // GB
  },
  growthFactor: 3,  // 3x after one year

  calculateRequiredResources() {
    const futureRPS = this.currentMetrics.peakRPS * this.growthFactor;
    // Apply a safety factor of 1.5
    const targetRPS = futureRPS * 1.5;

    // Assume each server can handle 500 RPS
    const serversNeeded = Math.ceil(targetRPS / 500);

    // Database storage added over the year (a rough upper bound: it applies the
    // 3x growth rate to all 12 months and excludes existing data)
    const storageNeeded =
      this.currentMetrics.dataGrowthPerMonth * 12 * this.growthFactor;

    return {
      applicationServers: serversNeeded,  // 45 servers
      targetRPS,                           // 22,500 RPS
      storageGB: storageNeeded,            // 1,800 GB
    };
  },
};

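Laws of Scaling

Amdahl's Law

Amdahl's law gives the theoretical upper limit on the speedup that parallelization can deliver.

Speedup = 1 / ((1 - P) + P/N)

P = fraction of the work that can be parallelized
N = number of processors (servers)

Parallelizable fraction (P)	2 servers	4 servers	8 servers	16 servers	Infinite servers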
50%	1.33x	1.60x	1.78x	1.88x	2.00x
75%	1.60x	2.29x	2.91x	3.37x	4.00x
90%	1.82x	3.08x	4.71x	6.40x	10.00x
95%	1.90x	3.48x	5.93x	9.14x	20.00x

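"This table makes it obvious that the part you can't parallelize becomes the bottleneck," says CTO Sato. "With a parallelizable fraction of 50%, you never get past a 2x speedup no matter how many servers you add. That's exactly why it matters to find and eliminate the bottlenecks in sequential processing."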
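
The formula is easy to check in code. The sketch below (function names are illustrative) recomputes the speedups for the same values of P and N used in the table:

```typescript
// Amdahl's law: speedup for a parallelizable fraction p on n processors.
function amdahlSpeedup(p: number, n: number): number {
  if (p < 0 || p > 1) throw new Error("p must be between 0 and 1");
  if (n < 1) throw new Error("n must be at least 1");
  return 1 / ((1 - p) + p / n);
}

// Upper bound as n grows without limit.
function amdahlLimit(p: number): number {
  return p === 1 ? Infinity : 1 / (1 - p);
}

for (const p of [0.5, 0.75, 0.9, 0.95]) {
  const row = [2, 4, 8, 16].map((n) => amdahlSpeedup(p, n).toFixed(2) + "x");
  console.log(`P=${p}`, row.join(" "), amdahlLimit(p).toFixed(2) + "x");
}
```

For example, `amdahlSpeedup(0.9, 16)` is 1 / (0.1 + 0.9/16) = 6.4, so 16 servers yield only a 6.4x speedup when 10% of the work is sequential.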

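Universal Scalability Law (USL)

The USL extends Amdahl's law. Amdahl's law already accounts for contention (the serialized part of the work); the USL adds a coherency term for the cost of keeping nodes in sync.

C(N) = N / (1 + α(N - 1) + β * N * (N - 1))

N = number of nodes
α = contention coefficient (delay from serialization)
β = coherency coefficient (delay from inter-node communication)

Key implications:

A large α → lock contention and contention for shared resources are the bottleneck
A large β → synchronous communication between nodes is the bottleneck
When β > 0, adding too many nodes actually reduces performance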

graph LR
    A["Few nodes"] -->|"Performance improves"| B["★ Optimal node count"]
    B -->|"Performance degrades"| C["Many nodes"]

    classDef lowStyle fill:#fef3c7,stroke:#d97706,stroke-width:2px,color:#92400e
    classDef optStyle fill:#d1fae5,stroke:#059669,color:#065f46
    classDef highStyle fill:#fee2e2,stroke:#dc2626,color:#991b1b

    class A lowStyle
    class B optStyle
    class C highStyle
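
The optimal node count in the diagram can be computed from the USL formula: setting the derivative of C(N) to zero gives N* = sqrt((1 - α) / β). The sketch below uses made-up coefficients (α = 0.05, β = 0.001) purely for illustration; in practice they would be fitted to load-test measurements.

```typescript
// Universal Scalability Law: relative capacity at n nodes.
function uslCapacity(n: number, alpha: number, beta: number): number {
  return n / (1 + alpha * (n - 1) + beta * n * (n - 1));
}

// Setting dC/dN = 0 gives the node count where capacity peaks: sqrt((1 - alpha) / beta).
function uslOptimalNodes(alpha: number, beta: number): number {
  if (beta <= 0) return Infinity; // without a coherency cost, capacity never declines
  return Math.sqrt((1 - alpha) / beta);
}

// Illustrative coefficients, not measurements.
const alpha = 0.05;
const beta = 0.001;
const peak = uslOptimalNodes(alpha, beta);
console.log(peak.toFixed(1)); // 30.8

// Capacity rises up to about 31 nodes, then falls.
for (const n of [8, 16, 31, 64, 128]) {
  console.log(n, uslCapacity(n, alpha, beta).toFixed(2));
}
```

With these coefficients, 128 nodes deliver less capacity than 8, which is the "adding nodes reduces performance" effect described above.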

Architecture Design Patterns

Read/Write Separation (Simplified CQRS)

graph LR
    CMD["Command<br/>(write)"] --> WA["Write API"] --> PDB["Primary DB"]
    PDB -->|"Replication"| RR["Read Replica<br/>+ Cache Layer"]
    QRY["Query<br/>(read)"] --> RA["Read API"] --> RR

    classDef cmdStyle fill:#e8a838,stroke:#b07c1e,color:#fff
    classDef qryStyle fill:#5cb85c,stroke:#3d8b3d,color:#fff
    classDef apiStyle fill:#67b7dc,stroke:#3a8ab5,color:#fff
    classDef dbStyle fill:#4a90d9,stroke:#2c5f8a,color:#fff

    class CMD cmdStyle
    class QRY qryStyle
    class WA,RA apiStyle
    class PDB,RR dbStyle

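Backpressure Pattern

// Backpressure using a message queue
// Requests beyond the processing capacity wait in the queue, which controls the flow.
// For the queue to push back on producers it should be bounded, so that publish()
// blocks or rejects when it is full. MessageQueue is an assumed interface.

// Accepting side: put the request on the queue
class OrderAcceptor {
  constructor(private queue: MessageQueue) {}

  async acceptOrder(order: OrderRequest): Promise<string> {
    const jobId = generateId();
    await this.queue.publish('orders', {
      jobId,
      order,
      timestamp: new Date(),
    });
    // Return the job ID right away (processing is asynchronous)
    return jobId;
  }
}

// Processing side: consume at its own pace
class OrderProcessor {
  constructor(private queue: MessageQueue) {}

  async processQueue(): Promise<void> {
    // Consume at a rate that matches the consumer's capacity
    const messages = await this.queue.consume('orders', {
      batchSize: 10,         // 10 messages at a time
      pollingInterval: 1000, // poll every second
    });

    for (const msg of messages) {
      await this.processOrder(msg.order);
      await this.queue.ack(msg);
    }
  }

  private async processOrder(order: OrderRequest): Promise<void> {
    // Business logic goes here
  }
}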
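
A queue only applies backpressure if it can say no. As a minimal in-memory sketch (the `BoundedQueue` class is illustrative and not tied to any particular message broker), a bounded queue refuses new work when it is full, which gives producers the signal to slow down, retry later, or return an error to the caller:

```typescript
// A bounded in-memory queue: offer() refuses new work when the queue is full.
class BoundedQueue<T> {
  private items: T[] = [];

  constructor(private readonly capacity: number) {}

  // Returns false instead of growing without limit.
  offer(item: T): boolean {
    if (this.items.length >= this.capacity) return false;
    this.items.push(item);
    return true;
  }

  // Consumers pull at most batchSize items at their own pace.
  poll(batchSize: number): T[] {
    return this.items.splice(0, batchSize);
  }

  get size(): number {
    return this.items.length;
  }
}

const orders = new BoundedQueue<string>(2);
console.log(orders.offer("order-1")); // true
console.log(orders.offer("order-2")); // true
console.log(orders.offer("order-3")); // false: queue is full, the producer must back off
console.log(orders.poll(10));         // [ 'order-1', 'order-2' ]
console.log(orders.offer("order-3")); // true: capacity is available again
```

An unbounded queue absorbs bursts but hides overload until memory or latency gives out; choosing the capacity is a trade-off between burst tolerance and how quickly overload becomes visible.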

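Summary

Key point	Details
Vertical vs. horizontal scaling	Vertical is easy but has a ceiling; horizontal has no hard ceiling but adds design complexity
Load balancing	The choice of algorithm directly affects performance
DB scaling	Choose among read replicas, sharding, and partitioning as appropriate
Cache strategy	Place caches in layers and balance them against consistency
Laws of scaling	Use Amdahl's law and the USL to understand the theoretical limits
Capacity planning	Plan resources from growth forecasts, with a safety factor applied

Checklist

I can explain the advantages and disadvantages of vertical and horizontal scaling
I can name at least three major load-balancing algorithms
I can explain DB scaling techniques (replicas, sharding)
I understand when to use each cache pattern
I can explain the implications of Amdahl's law and the USL

Next Steps

You have learned design techniques for scalability and performance. Next up is "Availability and Fault Tolerance": how should a system be designed so that it does not go down? We will look at concrete patterns.

Estimated reading time: 30 minutes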