Federico Calò

Software Developer | Tech Communicator

Autoscaling in Kubernetes: HPA, VPA, KEDA, and Karpenter

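One of the main advantages of Kubernetes in production is its ability to scale workloads automatically with demand. Yet most teams use only a fraction of the autoscaling features available: they configure an HPA on CPU and leave it at that. The result? Under-provisioned Pods that slow down under load, or over-provisioned Pods that waste budget for no reason.

The Kubernetes ecosystem offers four complementary levels of autoscaling: HPA scales Pods horizontally on CPU, memory, or custom metrics; VPA automatically right-sizes resource requests; KEDA enables event-driven scaling from any source (queues, databases, Prometheus metrics); and Karpenter provisions nodes in under 30 seconds, 40% faster than the traditional Cluster Autoscaler according to CNCF 2025 benchmarks. This article shows how to use them together in production.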

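What You'll Learn

How the HPA (Horizontal Pod Autoscaler) works with custom and external metrics
Configuring the VPA (Vertical Pod Autoscaler) for automatic right-sizing
KEDA: event-driven autoscaling from SQS, Kafka, Redis, and Prometheus
Karpenter: just-in-time node provisioning with NodePool and NodeClass
Combination patterns: using HPA and KEDA together without conflicts
Troubleshooting: why the HPA does not scale as expected
Best practices to avoid flapping loops and cold starts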

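Horizontal Pod Autoscaler (HPA)

The HPA is the Kubernetes component that scales the number of replicas of a Deployment, StatefulSet, or ReplicaSet based on observed metrics. The HPA controller queries the metrics every 15 seconds (configurable) and computes the desired number of replicas with the following formula:

desiredReplicas = ceil(currentReplicas * (currentMetricValue / desiredMetricValue))

To prevent flapping (continuous scaling up and down), the HPA applies a stabilization window: by default 5 minutes for scale-down and 0 seconds for scale-up.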
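
As a rough illustration of the formula above, the sketch below evaluates it for a few sample readings. The function names and sample numbers are made up for this example. The 10% tolerance is assumed from the default of the kube-controller-manager flag --horizontal-pod-autoscaler-tolerance, and the real controller applies further rules (missing metrics, Pods that are not ready yet, stabilization) that are left out here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.1) -> int:
    """Evaluate the HPA formula, skipping changes inside the tolerance band."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to the target: no scaling
    return math.ceil(current_replicas * ratio)


def clamp(replicas: int, min_replicas: int, max_replicas: int) -> int:
    """Keep the result between minReplicas and maxReplicas."""
    return max(min_replicas, min(replicas, max_replicas))


if __name__ == "__main__":
    # 4 replicas at 90% CPU against a 60% target
    print(desired_replicas(4, 90, 60))
    # 10 replicas at 63%: inside the tolerance band, nothing changes
    print(desired_replicas(10, 63, 60))
    # 20 replicas at 200% would ask for more than maxReplicas=20 allows
    print(clamp(desired_replicas(20, 200, 60), 2, 20))
```

The clamp step is kept separate on purpose: the formula itself knows nothing about minReplicas and maxReplicas, which are applied afterwards.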

HPA on CPU and Memory

A basic configuration covering CPU and memory. Note that scaling on memory only works if the application actually releases memory when load drops; otherwise scale-down never happens.

# hpa-basic.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60  # scale when average CPU > 60%
    - type: Resource
      resource:
        name: memory
        target:
          type: AverageValue
          averageValue: "512Mi"  # scale when average memory > 512Mi
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0  # scale up immediately
      policies:
        - type: Percent
          value: 100
          periodSeconds: 60  # at most double the replicas per minute
        - type: Pods
          value: 4
          periodSeconds: 60  # or at most 4 Pods per minute
      selectPolicy: Max  # use the more aggressive policy
    scaleDown:
      stabilizationWindowSeconds: 300  # wait 5 minutes before scaling down
      policies:
        - type: Percent
          value: 25
          periodSeconds: 60  # remove at most 25% of replicas per minute
      selectPolicy: Min
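
To see what the scaleUp policies in this manifest allow, the sketch below steps through a few 60-second periods. It is a simplification under stated assumptions: each period is treated as a single decision, whereas the real controller re-evaluates on every sync and bases the limit on the replica count at the start of the period. The names scale_up_limit and ramp are hypothetical helpers, not part of any Kubernetes API.

```python
import math


def scale_up_limit(current: int, percent: int = 100, pods: int = 4,
                   select_policy: str = "Max") -> int:
    """Highest replica count one period may reach under the two policies."""
    by_percent = math.ceil(current * (1 + percent / 100))  # type: Percent
    by_pods = current + pods                               # type: Pods
    pick = max if select_policy == "Max" else min
    return pick(by_percent, by_pods)


def ramp(start: int, desired: int, max_replicas: int = 20) -> list[int]:
    """Replica count after each period until the desired count is reached."""
    target = min(desired, max_replicas)
    steps = [start]
    while steps[-1] < target:
        steps.append(min(target, scale_up_limit(steps[-1])))
    return steps


if __name__ == "__main__":
    # From minReplicas=2 towards 20: the Pods policy wins first, then Percent
    print(ramp(2, 20))
```

With selectPolicy: Max the Pods policy dominates at low replica counts (adding 4 beats doubling 2), while the Percent policy takes over once the Deployment is larger.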

HPA with Custom Metrics via Prometheus Adapter

To scale on application metrics (requests per second, queue length, and so on) you need the Prometheus Adapter, which exposes Prometheus metrics through the Kubernetes custom metrics API:

# Install Prometheus Adapter
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus-adapter prometheus-community/prometheus-adapter \
  --namespace monitoring \
  --set prometheus.url=http://kube-prometheus-stack-prometheus.monitoring.svc \
  --set prometheus.port=9090

# prometheus-adapter-config.yaml - metric mapping rule
rules:
  custom:
    - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
      name:
        matches: "^(.*)_total$"
        as: "${1}_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'

---
# hpa-custom-metric.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server-rps-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 2
  maxReplicas: 50
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "1000"  # 1000 req/s per Pod

HPA with External Metrics

External metrics let you scale on sources outside the cluster, such as the length of an SQS queue or the number of unconsumed Kafka messages:

# hpa-external-metric.yaml
# Scale on the length of an SQS queue
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker-queue-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-worker
  minReplicas: 1
  maxReplicas: 100
  metrics:
    - type: External
      external:
        metric:
          name: sqs_approximate_number_of_messages_visible
          selector:
            matchLabels:
              queue: "job-queue-prod"
        target:
          type: AverageValue
          averageValue: "10"  # 10 messages per worker

# Check HPA status
kubectl get hpa -n production -w
kubectl describe hpa api-server-hpa -n production

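Vertical Pod Autoscaler (VPA)

The VPA monitors the actual CPU and memory usage of Pods and automatically adjusts resources.requests and limits. It addresses the "garbage in, garbage out" problem of resource requests: if you do not know how many resources a Pod needs, the VPA works it out for you.

VPA and HPA: Beware of Conflicts

Do not use the VPA in Auto mode together with an HPA that scales on CPU or memory: the two controllers conflict. The correct combination is the VPA in Off or Initial mode for resource requests and an HPA on custom metrics for replicas; alternatively, use KEDA instead of the HPA to avoid the problem.

Installing and Configuring the VPA

# Install the VPA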
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-install.sh

# or with Helm
helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm install vpa fairwinds-stable/vpa --namespace vpa --create-namespace

---
# vpa-recommendation.yaml - Off mode (recommendations only)
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-server-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  updatePolicy:
    updateMode: "Off"  # Off|Initial|Recreate|Auto
  resourcePolicy:
    containerPolicies:
      - containerName: api-server
        minAllowed:
          cpu: "100m"
          memory: "128Mi"
        maxAllowed:
          cpu: "4"
          memory: "4Gi"
        controlledResources: ["cpu", "memory"]
        controlledValues: RequestsAndLimits

# Read the VPA recommendations
kubectl describe vpa api-server-vpa -n production
# Typical output:
#   Recommendation:
#     Container Recommendations:
#       Container Name: api-server
#         Lower Bound:    cpu: 100m, memory: 256Mi
#         Target:         cpu: 450m, memory: 512Mi
#         Uncapped Target: cpu: 450m, memory: 512Mi
#         Upper Bound:    cpu: 2000m, memory: 2Gi
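
One way to act on these recommendations in Off mode is to compare the requests currently set on the container with the Lower Bound and Upper Bound. The sketch below does that for the sample output above. The parsing helpers are hypothetical and only handle the plain, m, Ki, Mi, and Gi forms that appear in this article, not the full Kubernetes quantity grammar, and the "current" requests are invented for the example.

```python
def parse_cpu(quantity: str) -> float:
    """Return CPU cores, accepting '450m' (millicores) or '2' (cores)."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_memory(quantity: str) -> int:
    """Return bytes, accepting binary suffixes such as '512Mi' or '2Gi'."""
    for suffix, factor in MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def classify(request: float, lower: float, upper: float) -> str:
    """Place a current request relative to the VPA bounds."""
    if request < lower:
        return "under-provisioned"
    if request > upper:
        return "over-provisioned"
    return "within bounds"


if __name__ == "__main__":
    # Hypothetical current requests checked against the sample bounds
    print(classify(parse_cpu("50m"), parse_cpu("100m"), parse_cpu("2000m")))
    print(classify(parse_memory("4Gi"), parse_memory("256Mi"), parse_memory("2Gi")))
```

A request outside the bounds is a candidate for a manual change or a CI/CD update; a request inside them can usually stay as it is.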

VPA in Auto Mode

# vpa-auto.yaml - updates resources automatically (restarts Pods)
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: background-worker-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: background-worker
  updatePolicy:
    updateMode: "Auto"  # restarts Pods with the new resources
    minReplicas: 2      # do not update when fewer than 2 replicas are running
  resourcePolicy:
    containerPolicies:
      - containerName: worker
        minAllowed:
          cpu: "200m"
          memory: "256Mi"
        maxAllowed:
          cpu: "2"
          memory: "2Gi"

KEDA: Event-Driven Autoscaling

KEDA (Kubernetes Event-Driven Autoscaling) is a CNCF operator that extends the HPA with 60+ prebuilt scalers: AWS SQS, Azure Service Bus, Kafka, RabbitMQ, Redis, Prometheus, Datadog, and more. KEDA can scale a Deployment down to 0 replicas when there are no events and back to 1 as soon as the first event arrives.

Installing KEDA

# Install KEDA via Helm
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda \
  --namespace keda \
  --create-namespace \
  --version 2.14.0

# Verify
kubectl get pods -n keda

ScaledObject for Kafka

A worker that consumes from a Kafka topic, scaled on consumer group lag:

# keda-kafka-scaledobject.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer-scaler
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kafka-consumer
  pollingInterval: 15       # check every 15 seconds
  cooldownPeriod: 30        # wait 30s before scaling to 0
  minReplicaCount: 0        # scale to zero when there are no messages
  maxReplicaCount: 50
  advanced:
    restoreToOriginalReplicaCount: true
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 30
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka-broker.kafka.svc:9092
        consumerGroup: my-consumer-group
        topic: orders-topic
        lagThreshold: "100"     # 100 messages per replica
        offsetResetPolicy: latest
        allowIdleConsumers: "false"
        scaleToZeroOnInvalidOffset: "false"
      authenticationRef:
        name: kafka-auth  # TriggerAuthentication holding the Kafka credentials
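
As a rough model of how the lag trigger above translates into replicas: the total consumer group lag is divided by lagThreshold, and with allowIdleConsumers set to "false" the result is capped at the number of partitions, since extra consumers would sit idle. The function below is a simplified sketch with hypothetical names, not KEDA's actual code; in practice KEDA feeds the metric to an HPA, which makes the final decision. The partition count of 12 is an assumption made for the example.

```python
import math


def kafka_replicas(total_lag: int, lag_threshold: int, partitions: int,
                   min_replicas: int = 0, max_replicas: int = 50,
                   allow_idle_consumers: bool = False) -> int:
    """Estimate the replica count for a given consumer group lag."""
    if total_lag <= 0:
        return min_replicas  # nothing to consume: fall back to the minimum
    wanted = math.ceil(total_lag / lag_threshold)
    if not allow_idle_consumers:
        wanted = min(wanted, partitions)  # at most one consumer per partition
    return max(min_replicas, min(wanted, max_replicas))


if __name__ == "__main__":
    # Assumed topic with 12 partitions and lagThreshold=100
    for lag in (0, 1, 250, 5000):
        print(lag, kafka_replicas(lag, 100, partitions=12))
```

The partition cap is the reason a high maxReplicaCount alone does not help a Kafka consumer: beyond one replica per partition, more throughput requires more partitions.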

ScaledObject for AWS SQS

# keda-sqs-scaledobject.yaml
apiVersion: v1
kind: Secret
metadata:
  name: aws-credentials
  namespace: production
data:
  AWS_ACCESS_KEY_ID: BASE64_KEY
  AWS_SECRET_ACCESS_KEY: BASE64_SECRET
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: aws-trigger-auth
  namespace: production
spec:
  secretTargetRef:
    - parameter: awsAccessKeyID
      name: aws-credentials
      key: AWS_ACCESS_KEY_ID
    - parameter: awsSecretAccessKey
      name: aws-credentials
      key: AWS_SECRET_ACCESS_KEY
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sqs-worker-scaler
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sqs-worker
  minReplicaCount: 0
  maxReplicaCount: 100
  triggers:
    - type: aws-sqs-queue
      authenticationRef:
        name: aws-trigger-auth
      metadata:
        queueURL: https://sqs.eu-west-1.amazonaws.com/123456789/job-queue
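        queueLength: "5"       # 5 messages per replica
        awsRegion: eu-west-1
        identityOwner: pod     # default: authenticate via the TriggerAuthentication above; "operator" would use the KEDA operator's own IAM role (e.g. IRSA)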

ScaledObject for Prometheus

# keda-prometheus-scaledobject.yaml
# Scale on a custom Prometheus query
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: api-latency-scaler
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicaCount: 2
  maxReplicaCount: 30
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://kube-prometheus-stack-prometheus.monitoring.svc:9090
        metricName: http_request_duration_p99
        threshold: "0.5"  # scale when P99 latency > 500ms
        query: histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket{job="api-server"}[2m])) by (le))

# Check KEDA status
kubectl get scaledobject -n production
kubectl describe scaledobject kafka-consumer-scaler -n production

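Karpenter: Just-in-Time Node Provisioning

Karpenter is a next-generation node provisioner created by AWS and now a CNCF project. Unlike the Cluster Autoscaler, which works with predefined node groups, Karpenter provisions nodes with exactly the characteristics the pending Pods need: instance type, zone, on-demand or spot capacity, CPU or GPU. The result: provisioning in 30-60 seconds, against 3-5 minutes for the Cluster Autoscaler.

Karpenter Architecture

Karpenter completely replaces the Cluster Autoscaler. It relies on two main CRDs:

NodePool: defines the requirements of the nodes Karpenter may create (instance types, zones, taints, labels, limits)
NodeClass (EC2NodeClass on AWS): cloud-provider-specific configuration (AMI, subnets, security groups, user data)

Installing Karpenter on EKS

# Prerequisite: IRSA configured for Karpenter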
export CLUSTER_NAME="my-production-cluster"
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export AWS_REGION=eu-west-1

# Install Karpenter with Helm from the OCI registry
# (recent releases are published there rather than in the legacy charts.karpenter.sh repository)
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --namespace karpenter \
  --create-namespace \
  --version 1.0.0 \
  --set "serviceAccount.annotations.eks\.amazonaws\.com/role-arn=arn:aws:iam::${AWS_ACCOUNT_ID}:role/KarpenterControllerRole" \
  --set settings.clusterName=${CLUSTER_NAME} \
  --set settings.interruptionQueue=${CLUSTER_NAME} \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi

NodePool and EC2NodeClass for Production

# karpenter-nodepool.yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general-purpose
spec:
  template:
    metadata:
      labels:
        node-type: general-purpose
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws  # karpenter.sh/v1 uses group, not apiVersion
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand", "spot"]  # prefers spot, falls back to on-demand
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]  # compute, general purpose, memory
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["2"]  # generation 3+ instances only
        - key: karpenter.k8s.aws/instance-cpu
          operator: In
          values: ["4", "8", "16", "32"]
      taints: []
      expireAfter: 720h    # recycle nodes every 30 days
      terminationGracePeriod: 48h
  limits:
    cpu: "500"             # at most 500 vCPU in this NodePool
    memory: 2000Gi
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m  # consolidate empty or underutilized nodes after 1 minute
    budgets:
      - nodes: "20%"      # never disrupt more than 20% of nodes at once
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2023
  amiSelectorTerms:
    - alias: al2023@latest   # always use the most recent AMI
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "my-production-cluster"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "my-production-cluster"
  instanceProfile: KarpenterNodeInstanceProfile
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 100Gi
        volumeType: gp3
        iops: 3000
        encrypted: true
  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 1   # security: a hop limit of 1 keeps containers from reaching IMDS
    httpTokens: required         # require IMDSv2

NodePool for GPU Workloads

# karpenter-gpu-nodepool.yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu-nodes
spec:
  template:
    metadata:
      labels:
        node-type: gpu
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws  # karpenter.sh/v1 uses group, not apiVersion
        kind: EC2NodeClass
        name: gpu-nodeclass
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]  # GPU spot capacity is not available in every zone
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ["g5", "p3", "p4d"]  # GPU instance families
      taints:
        - key: nvidia.com/gpu
          effect: NoSchedule      # only Pods that tolerate this taint
  limits:
    cpu: "128"
    memory: 1024Gi
    nvidia.com/gpu: "32"          # at most 32 GPUs in this NodePool

Consolidation and Cost Optimization

# Remove the do-not-disrupt annotation so Karpenter may consolidate the node again
kubectl annotate node <node-name> karpenter.sh/do-not-disrupt-

# List the nodes created by Karpenter
kubectl get nodes -l karpenter.sh/nodepool=general-purpose -o wide

# Follow Karpenter's decisions in real time
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter -f | grep -E "launched|terminated|consolidated"

# List instance type and capacity type per NodeClaim as input for a cost tool such as Kubecost
# (NodeClaims carry no price information themselves)
kubectl get nodeclaims -o json | jq '.items[] | {name: .metadata.name, nodepool: .metadata.labels["karpenter.sh/nodepool"], instanceType: .metadata.labels["node.kubernetes.io/instance-type"], capacityType: .metadata.labels["karpenter.sh/capacity-type"]}'

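Combining HPA, KEDA, and Karpenter

In a mature production cluster, these three components work in synergy:

KEDA scales Pods from 0 to N based on events (Kafka lag, SQS depth, Prometheus queries)
Karpenter detects pending Pods and provisions nodes with exactly the required characteristics in 30-60 seconds
VPA (in Off mode) provides resource request recommendations that you apply manually or through a CI/CD pipeline

Recommended Patterns for Production

Stateless API server: KEDA on Prometheus (P99 latency) + Karpenter general-purpose NodePool
Queue worker: KEDA on SQS/Kafka with minReplicaCount=0 + Karpenter with an on-demand/spot mix
Database/StatefulSet: VPA in Auto mode with minReplicas >= 2, no HPA on memory
Batch job: KEDA ScaledJob (not ScaledObject) for Kubernetes Jobs that run to completion
Do not add a separate CPU HPA to a workload managed by KEDA: KEDA creates its own HPA for each ScaledObject, and two HPAs on the same target conflict.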

KEDA ScaledJob for Batch Workloads

# keda-scaledjob.yaml - for batch jobs that run to completion
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: ml-training-job
  namespace: production
spec:
  jobTargetRef:
    template:
      spec:
        containers:
          - name: trainer
            image: my-registry/ml-trainer:latest
            resources:
              requests:
                cpu: "2"
                memory: "4Gi"
                nvidia.com/gpu: "1"
              limits:
                nvidia.com/gpu: "1"
        # tolerations belong to the Pod spec, not to the container
        tolerations:
          - key: nvidia.com/gpu
            operator: Exists
            effect: NoSchedule
        restartPolicy: Never
  pollingInterval: 30
  maxReplicaCount: 20
  successfulJobsHistoryLimit: 5
  failedJobsHistoryLimit: 3
  triggers:
    - type: aws-sqs-queue
      authenticationRef:
        name: aws-trigger-auth
      metadata:
        queueURL: https://sqs.eu-west-1.amazonaws.com/123456789/ml-jobs
        queueLength: "1"   # one Job per message
        awsRegion: eu-west-1

Troubleshooting HPA and KEDA

# HPA not scaling? Check its status
kubectl describe hpa api-server-hpa -n production
# Look for: "AbleToScale", "ScalingActive", "DesiredReplicas"
# Common error: "failed to get cpu utilization" = metrics-server is not installed, or the Pods have no CPU requests

# Check that metrics-server is working
kubectl top pods -n production
kubectl top nodes

# KEDA not scaling to zero? Check the cooldownPeriod
kubectl get scaledobject kafka-consumer-scaler -n production -o yaml | grep -A5 "conditions"

# Karpenter not provisioning?
kubectl get pods --field-selector=status.phase=Pending -A
kubectl describe pod <pod-name> | grep "Events" -A20
# Look for: "0/N nodes are available" plus the reason the Pod is pending

# View the Karpenter logs
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter --tail=50 | grep -i "error\|warning\|launched"

# Protect a Pod from voluntary disruption: Karpenter will not disrupt the node it runs on
kubectl annotate pods <pod-name> karpenter.sh/do-not-disrupt=true

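Best Practices and Anti-Patterns

Autoscaling Best Practices

Always set minReplicas >= 2 for critical services: scaling from 0 incurs a cold start. For production APIs, keep at least 2 replicas.
Use PodDisruptionBudgets: they keep Karpenter from evicting too many Pods at once while it drains nodes during consolidation.
Configure accurate resource requests: the HPA computes utilization as a percentage of resources.requests. If requests are set too high, utilization stays low and the HPA never scales up; if they are too low, it scales too eagerly.
Strict readiness probes: Kubernetes waits until a Pod is ready before sending it traffic. Without a readiness probe, freshly scaled Pods receive traffic before they are ready.
Monitor flapping: if the HPA scales up and down every few minutes, increase the scale-down stabilizationWindowSeconds.
Use topology spread constraints with Karpenter: spread Pods across multiple zones for high availability, even while nodes are being provisioned.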
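
To make the point about resource requests concrete, the sketch below computes CPU utilization the way an HPA with a Utilization target sees it: actual usage divided by the request. The usage and request figures are invented for illustration, and the function name is hypothetical.

```python
def cpu_utilization_percent(usage_millicores: float,
                            request_millicores: float) -> float:
    """Utilization as a percentage of the CPU request."""
    if request_millicores <= 0:
        # Mirrors the anti-pattern of Pods without requests
        raise ValueError("no CPU request set: utilization is undefined")
    return 100 * usage_millicores / request_millicores


if __name__ == "__main__":
    usage = 300  # the same real load in every case
    for request in (250, 500, 2000):
        pct = cpu_utilization_percent(usage, request)
        verdict = "scales up" if pct > 60 else "does not scale up"
        print(f"request={request}m utilization={pct:.0f}% -> {verdict}")
```

The load is identical in all three cases; only the request changes whether a 60% target is ever crossed.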

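Anti-Patterns to Avoid

HPA without resource requests: the HPA cannot compute utilization when the container spec defines no requests.
VPA Auto + HPA on CPU/memory: the two controllers compete over the same resources and produce inconsistent scaling. If you want both, use KEDA on custom metrics.
maxReplicas too low: if peak traffic needs 100 Pods but maxReplicas is 20, autoscaling falls short and the service degrades.
Karpenter without explicit disruption budgets: if you leave disruption.budgets unset, you rely on Karpenter's default budget rather than limits chosen for your workloads, and overnight consolidation may drain more nodes at once than the service tolerates.
KEDA polling interval too low: a pollingInterval of 5 seconds against external sources (SQS, external APIs) generates too many API calls and can lead to throttling.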
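
A quick back-of-the-envelope check shows why the polling interval matters. The sketch assumes one external API call per poll per ScaledObject, which is a simplification (some scalers issue several calls per poll), and the count of 20 ScaledObjects is invented for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def polls_per_day(polling_interval_s: int, scaled_objects: int = 1) -> int:
    """Number of polls per day across all ScaledObjects."""
    if polling_interval_s <= 0:
        raise ValueError("pollingInterval must be positive")
    return (SECONDS_PER_DAY // polling_interval_s) * scaled_objects


if __name__ == "__main__":
    for interval in (5, 15, 30):
        print(f"pollingInterval={interval}s -> "
              f"{polls_per_day(interval, scaled_objects=20)} calls/day")
```

Moving from 5 to 30 seconds cuts the call volume by a factor of six, at the cost of reacting up to 25 seconds later to a new burst of events.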

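Conclusions and Next Steps

Effective autoscaling in Kubernetes is not a single solution but a multi-level strategy: KEDA for event-driven Pod scaling, HPA for utilization-based scaling, VPA to optimize resource requests, and Karpenter for fast node provisioning. Used together, these tools can cut costs by 30-50% compared with a statically provisioned cluster while maintaining high SLAs.

The keys to success are accurate resource requests (this is where the VPA helps), choosing the right metric to scale on (CPU is not always the answer), and configuring scaling behavior (stabilization windows, rate limits) to avoid oscillations that degrade performance instead of improving it.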