FinOps becomes critical in 2026: cloud budgets are exploding (+40%/year) with 30% average waste. This guide covers automatic tagging, rightsizing, Kubecost for Kubernetes, and continuous real-time optimization.
Learn how Kubernetes, Kubecost for Kubernetes costs, Terraform for IaC, and Prometheus for monitoring fit together in a complete FinOps strategy.
Contents
- What is FinOps?
- Tagging and cost allocation
- Rightsizing instances and storage
- Kubecost: FinOps for Kubernetes
- Reserved Instances and Savings Plans
- Spot instances and resilient architecture
- Real-time dashboards and alerts
- FinOps culture and governance
- Conclusion
What is FinOps?
Definition and 2026 context
FinOps is a cultural practice and discipline that brings finance, technology, and business together to optimize cloud spend.
The problem:
- Cloud spend: +40% annual growth
- Average waste: 30% of the cloud budget
- Visibility: fewer than 50% of companies know their real costs
- Attribution: impossible to bill teams accurately
FinOps goals:
- Visibility: real-time costs per team/project
- Accountability: each team owns its costs
- Optimization: ROI-driven decisions
- Predictability: accurate budgets and forecasts
2026 statistics
- 82% of companies have formally adopted FinOps
- $1.3T in global cloud spend
- 30% average savings after adopting FinOps
- 2-6 months typical ROI for a FinOps initiative
- FinOps Engineer is among the top 10 most in-demand cloud roles
The FinOps Foundation model
┌────────────────────────────────────────┐
│ INFORM (Visibility)                    │
│ • Cost allocation                      │
│ • Resource tagging                     │
│ • Forecasting                          │
│ • Benchmarking                         │
└────────────────────────────────────────┘
                  ↓
┌────────────────────────────────────────┐
│ OPTIMIZE (Efficiency)                  │
│ • Rightsizing                          │
│ • Reserved Instances                   │
│ • Spot instances                       │
│ • Storage optimization                 │
└────────────────────────────────────────┘
                  ↓
┌────────────────────────────────────────┐
│ OPERATE (Governance)                   │
│ • Automated policies                   │
│ • Budget alerts                        │
│ • Chargeback/Showback                  │
│ • FinOps culture                       │
└────────────────────────────────────────┘
Tagging and cost allocation
Tagging strategy
Essential tags:
- Environment: prod/staging/dev
- Team: owning team
- Project: project/product
- CostCenter: finance cost center
- Owner: responsible person's email
- Application: application name
- ManagedBy: terraform/manual/autoscaling
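The schema above can be sketched as a small validation helper. This is purely illustrative (not an AWS API): the tag names mirror the list, and the assumption that only Environment has a constrained value set is mine.

```python
# Required tag keys, per the schema above (illustrative helper, not an AWS API).
REQUIRED_TAGS = {"Environment", "Team", "Project", "CostCenter",
                 "Owner", "Application", "ManagedBy"}
ALLOWED_ENVIRONMENTS = {"prod", "staging", "dev"}

def validate_tags(tags: dict) -> list:
    """Return a list of human-readable violations (empty list = compliant)."""
    issues = [f"missing tag: {t}" for t in sorted(REQUIRED_TAGS - tags.keys())]
    env = tags.get("Environment")
    if env is not None and env not in ALLOWED_ENVIRONMENTS:
        issues.append(f"invalid Environment: {env}")
    return issues

print(validate_tags({"Environment": "prod", "Team": "platform"}))
```

The same function can back a CI check or a nightly compliance report, as shown later in this guide.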
AWS tagging policy
Condition keys inside a single IAM statement are ANDed, so a single StringNotLike block would only deny a request when every tag check fails at once. Enforce each required tag in its own statement instead (repeat the Null statement for Project and CostCenter):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyBadEnvironmentTag",
      "Effect": "Deny",
      "Action": [
        "ec2:RunInstances",
        "rds:CreateDBInstance",
        "s3:CreateBucket",
        "elasticloadbalancing:CreateLoadBalancer"
      ],
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "aws:RequestTag/Environment": ["prod", "staging", "dev"]
        }
      }
    },
    {
      "Sid": "DenyMissingTeamTag",
      "Effect": "Deny",
      "Action": [
        "ec2:RunInstances",
        "rds:CreateDBInstance",
        "s3:CreateBucket",
        "elasticloadbalancing:CreateLoadBalancer"
      ],
      "Resource": "*",
      "Condition": {
        "Null": {
          "aws:RequestTag/Team": "true"
        }
      }
    }
  ]
}
Apply it via AWS Organizations:
# Service Control Policy (SCP)
aws organizations create-policy \
  --name RequireTagsPolicy \
  --type SERVICE_CONTROL_POLICY \
  --content file://require-tags-policy.json
# Attach it to an OU
aws organizations attach-policy \
  --policy-id p-abc123 \
  --target-id ou-xyz789
Automatic tagging with Terraform
# variables.tf
variable "default_tags" {
  type = map(string)
  default = {
    Environment = "prod"
    ManagedBy   = "terraform"
    Team        = "platform"
    CostCenter  = "engineering"
  }
}
# provider.tf
provider "aws" {
  region = "eu-west-1"
  default_tags {
    tags = var.default_tags
  }
}
# main.tf - the provider's default_tags are applied automatically,
# so each resource only declares its resource-specific tags
resource "aws_instance" "app" {
  ami           = "ami-12345678"
  instance_type = "t3.medium"
  tags = {
    Name        = "app-server"
    Application = "payment-api"
    Owner       = "team-payments@company.com"
  }
}
Tag Compliance Checker
#!/usr/bin/env python3
# check_tags.py
import boto3
import json
from datetime import datetime

REQUIRED_TAGS = ['Environment', 'Team', 'Project', 'CostCenter']

def check_ec2_tags():
    ec2 = boto3.client('ec2')
    instances = ec2.describe_instances()
    non_compliant = []
    for reservation in instances['Reservations']:
        for instance in reservation['Instances']:
            instance_id = instance['InstanceId']
            tags = {tag['Key']: tag['Value'] for tag in instance.get('Tags', [])}
            missing_tags = [tag for tag in REQUIRED_TAGS if tag not in tags]
            if missing_tags:
                non_compliant.append({
                    'InstanceId': instance_id,
                    'MissingTags': missing_tags,
                    'State': instance['State']['Name']
                })
    return non_compliant

def check_rds_tags():
    rds = boto3.client('rds')
    instances = rds.describe_db_instances()
    non_compliant = []
    for instance in instances['DBInstances']:
        db_id = instance['DBInstanceIdentifier']
        arn = instance['DBInstanceArn']
        tags_response = rds.list_tags_for_resource(ResourceName=arn)
        tags = {tag['Key']: tag['Value'] for tag in tags_response['TagList']}
        missing_tags = [tag for tag in REQUIRED_TAGS if tag not in tags]
        if missing_tags:
            non_compliant.append({
                'DBInstanceId': db_id,
                'MissingTags': missing_tags,
                'Status': instance['DBInstanceStatus']
            })
    return non_compliant

def main():
    print("Checking tag compliance...")
    ec2_issues = check_ec2_tags()
    rds_issues = check_rds_tags()
    report = {
        'Timestamp': datetime.now().isoformat(),
        'EC2': {
            'NonCompliantCount': len(ec2_issues),
            'NonCompliant': ec2_issues
        },
        'RDS': {
            'NonCompliantCount': len(rds_issues),
            'NonCompliant': rds_issues
        }
    }
    print(json.dumps(report, indent=2))
    # Slack notification when there are issues
    if ec2_issues or rds_issues:
        # send_slack_alert(report)
        pass

if __name__ == '__main__':
    main()
# Daily cron job
0 9 * * * /usr/local/bin/check_tags.py | mail -s "Tag Compliance Report" finops@company.com
Rightsizing instances and storage
Analyzing EC2 utilization
# CloudWatch metrics over 14 days
aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-abc123 \
  --start-time 2026-01-03T00:00:00Z \
  --end-time 2026-01-17T00:00:00Z \
  --period 3600 \
  --statistics Average,Maximum
# Example output:
# Average: 12%
# Maximum: 28%
# → Instance is oversized; rightsizing recommended
Automated rightsizing script
#!/usr/bin/env python3
# rightsizing_recommendations.py
import boto3
from datetime import datetime, timedelta

def get_cpu_utilization(instance_id, days=14):
    cloudwatch = boto3.client('cloudwatch')
    end_time = datetime.utcnow()
    start_time = end_time - timedelta(days=days)
    response = cloudwatch.get_metric_statistics(
        Namespace='AWS/EC2',
        MetricName='CPUUtilization',
        Dimensions=[{'Name': 'InstanceId', 'Value': instance_id}],
        StartTime=start_time,
        EndTime=end_time,
        Period=3600,
        Statistics=['Average', 'Maximum']
    )
    if not response['Datapoints']:
        return None, None
    avg = sum(d['Average'] for d in response['Datapoints']) / len(response['Datapoints'])
    max_cpu = max(d['Maximum'] for d in response['Datapoints'])
    return avg, max_cpu

def get_rightsizing_recommendation(instance_type, avg_cpu, max_cpu):
    """
    Utilization-based recommendations:
    - avg < 20% and max < 40%: downsize
    - avg > 70% or max > 90%: upsize
    """
    # Instance type mapping (simplified; m5.large is the smallest m5 size)
    downsize_map = {
        't3.xlarge': 't3.large',
        't3.large': 't3.medium',
        't3.medium': 't3.small',
        'm5.2xlarge': 'm5.xlarge',
        'm5.xlarge': 'm5.large'
    }
    upsize_map = {v: k for k, v in downsize_map.items()}
    if avg_cpu < 20 and max_cpu < 40:
        return downsize_map.get(instance_type, instance_type), "downsize"
    elif avg_cpu > 70 or max_cpu > 90:
        return upsize_map.get(instance_type, instance_type), "upsize"
    return instance_type, "optimal"

def analyze_instances():
    ec2 = boto3.client('ec2')
    instances = ec2.describe_instances(
        Filters=[{'Name': 'instance-state-name', 'Values': ['running']}]
    )
    recommendations = []
    for reservation in instances['Reservations']:
        for instance in reservation['Instances']:
            instance_id = instance['InstanceId']
            instance_type = instance['InstanceType']
            tags = {tag['Key']: tag['Value'] for tag in instance.get('Tags', [])}
            name = tags.get('Name', 'N/A')
            avg_cpu, max_cpu = get_cpu_utilization(instance_id)
            if avg_cpu is None:
                continue
            recommended_type, action = get_rightsizing_recommendation(
                instance_type, avg_cpu, max_cpu
            )
            if action != "optimal":
                # Compute the savings
                current_cost = get_instance_cost(instance_type)
                new_cost = get_instance_cost(recommended_type)
                monthly_savings = (current_cost - new_cost) * 730  # hours/month
                recommendations.append({
                    'InstanceId': instance_id,
                    'Name': name,
                    'CurrentType': instance_type,
                    'AvgCPU': f"{avg_cpu:.1f}%",
                    'MaxCPU': f"{max_cpu:.1f}%",
                    'Recommendation': recommended_type,
                    'Action': action,
                    'MonthlySavings': f"${monthly_savings:.2f}"
                })
    return recommendations

def get_instance_cost(instance_type):
    """On-demand hourly price (simplified; use the AWS Price List API in production)"""
    prices = {
        't3.small': 0.0208,
        't3.medium': 0.0416,
        't3.large': 0.0832,
        't3.xlarge': 0.1664,
        'm5.large': 0.096,
        'm5.xlarge': 0.192,
        'm5.2xlarge': 0.384
    }
    return prices.get(instance_type, 0)

def main():
    print("Analyzing EC2 instances for rightsizing...")
    recommendations = analyze_instances()
    print(f"\nFound {len(recommendations)} rightsizing opportunities:")
    print("-" * 100)
    for rec in recommendations:
        print(f"Instance: {rec['InstanceId']} ({rec['Name']})")
        print(f"  Current: {rec['CurrentType']} - CPU: {rec['AvgCPU']} avg, {rec['MaxCPU']} max")
        print(f"  Recommendation: {rec['Action'].upper()} to {rec['Recommendation']}")
        print(f"  Monthly savings: {rec['MonthlySavings']}")
        print()
    total_savings = sum(float(r['MonthlySavings'].replace('$', '')) for r in recommendations)
    print(f"Total potential monthly savings: ${total_savings:.2f}")
    print(f"Annual savings: ${total_savings * 12:.2f}")

if __name__ == '__main__':
    main()
Storage optimization
Unattached EBS volumes:
# List available (unattached) volumes
aws ec2 describe-volumes \
  --filters Name=status,Values=available \
  --query 'Volumes[*].[VolumeId,Size,VolumeType,CreateTime]' \
  --output table
# Cost reference
# gp3: $0.08/GB/month
# io2: $0.125/GB/month
# Snapshot, then delete unused volumes
for vol in $(aws ec2 describe-volumes --filters Name=status,Values=available --query 'Volumes[*].VolumeId' --output text); do
  aws ec2 create-snapshot --volume-id $vol --description "Backup before deletion"
  aws ec2 delete-volume --volume-id $vol
done
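To size the quick win before running the loop above, the waste can be totaled from the describe-volumes output. A minimal sketch using the per-GB prices quoted above (gp3 $0.08, io2 $0.125; the gp2 price is my assumption, and regional prices vary):

```python
# Per-GB-month EBS prices (gp3/io2 from the text above; gp2 is an assumed value).
EBS_PRICE_PER_GB_MONTH = {"gp3": 0.08, "gp2": 0.10, "io2": 0.125}

def unattached_monthly_cost(volumes: list) -> float:
    """volumes: [{'Size': GiB, 'VolumeType': 'gp3'}, ...] as returned by describe-volumes."""
    return sum(v["Size"] * EBS_PRICE_PER_GB_MONTH.get(v["VolumeType"], 0.10)
               for v in volumes)

vols = [{"Size": 500, "VolumeType": "gp3"}, {"Size": 100, "VolumeType": "io2"}]
print(f"${unattached_monthly_cost(vols):.2f}/month wasted")
```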
S3 lifecycle policies:
{
  "Rules": [
    {
      "Id": "MoveToIA",
      "Status": "Enabled",
      "Transitions": [
        {
          "Days": 90,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 180,
          "StorageClass": "GLACIER"
        },
        {
          "Days": 365,
          "StorageClass": "DEEP_ARCHIVE"
        }
      ],
      "NoncurrentVersionTransitions": [
        {
          "NoncurrentDays": 30,
          "StorageClass": "STANDARD_IA"
        }
      ],
      "NoncurrentVersionExpiration": {
        "NoncurrentDays": 90
      }
    },
    {
      "Id": "DeleteOldBackups",
      "Status": "Enabled",
      "Prefix": "backups/",
      "Expiration": {
        "Days": 730
      }
    }
  ]
}
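To see what these transitions are worth, here is a rough per-class cost comparison for 1 TB. The per-GB-month prices are assumptions (typical us-east-1 list prices: Standard $0.023, Standard-IA $0.0125, Glacier $0.004, Deep Archive $0.00099); check current regional pricing.

```python
# Assumed per-GB-month storage prices; retrieval and request fees are ignored.
PRICES = {"STANDARD": 0.023, "STANDARD_IA": 0.0125,
          "GLACIER": 0.004, "DEEP_ARCHIVE": 0.00099}

def monthly_cost(gb: float, storage_class: str) -> float:
    """Monthly storage cost for `gb` gigabytes in the given class."""
    return gb * PRICES[storage_class]

gb = 1024  # 1 TB
for cls in PRICES:
    print(f"{cls:>12}: ${monthly_cost(gb, cls):.2f}/month")
```

Retrieval costs and minimum storage durations (90 days for Glacier, 180 for Deep Archive) can offset these savings for frequently accessed data, so the transition thresholds above should follow actual access patterns.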
Kubecost: FinOps for Kubernetes
Installing Kubecost
# Add the Helm repo
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm repo update
# Install Kubecost
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost --create-namespace \
  --set kubecostToken="aGVsbEB3b3JsZAo=" \
  --set prometheus.server.persistentVolume.enabled=true \
  --set prometheus.server.persistentVolume.size=32Gi
# Verify
kubectl get pods -n kubecost
# kubecost-cost-analyzer-xxx        3/3  Running
# kubecost-prometheus-server-xxx    2/2  Running
# Port-forward the UI
kubectl port-forward -n kubecost svc/kubecost-cost-analyzer 9090:9090
# Access: http://localhost:9090
Cloud billing configuration
AWS:
# kubecost-values.yaml
kubecostProductConfigs:
  cloudIntegrationSecret: cloud-integration
  awsSpotDataRegion: eu-west-1
  awsSpotDataBucket: kubecost-spot-data-bucket
  athenaProjectID: my-project
  athenaBucketName: aws-athena-query-results-bucket
  athenaRegion: eu-west-1
  athenaDatabase: athenacurcfn_cur
  athenaTable: cur
GCP:
kubecostProductConfigs:
  cloudIntegrationSecret: cloud-integration
  gcpBillingDataDataset: billing_export
  gcpProjectID: my-gcp-project
# Secret for the credentials
kubectl create secret generic cloud-integration \
  -n kubecost \
  --from-file=cloud-integration.json=gcp-key.json
# Upgrade with the config
helm upgrade kubecost kubecost/cost-analyzer \
  -n kubecost \
  -f kubecost-values.yaml
Allocation by namespace/label
# Kubecost API - costs by namespace
curl "http://localhost:9090/model/allocation?window=7d&aggregate=namespace"
# JSON output:
{
  "data": [
    {
      "namespace": "production",
      "totalCost": 12456.78,
      "cpuCost": 5432.10,
      "ramCost": 4321.09,
      "pvCost": 2703.59
    },
    {
      "namespace": "staging",
      "totalCost": 1234.56,
      ...
    }
  ]
}
Allocation by label:
# Costs by team
curl "http://localhost:9090/model/allocation?window=30d&aggregate=label:team"
# Costs by application
curl "http://localhost:9090/model/allocation?window=30d&aggregate=label:app"
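These responses are easy to post-process for showback reports. A minimal sketch that ranks namespaces by cost, assuming the response is shaped like the JSON sample above (field names taken from that sample):

```python
# Rank namespaces by total cost from a /model/allocation-style response.
def top_namespaces(allocation: dict, n: int = 5) -> list:
    """Return [(namespace, totalCost), ...] sorted by descending cost."""
    rows = [(item["namespace"], item["totalCost"]) for item in allocation["data"]]
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]

sample = {"data": [
    {"namespace": "production", "totalCost": 12456.78},
    {"namespace": "staging", "totalCost": 1234.56},
]}
print(top_namespaces(sample))
```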
Savings recommendations
# Recommendations API
curl "http://localhost:9090/model/savings"
# Output:
{
  "clusterSizing": {
    "overprovisioned": [
      {
        "namespace": "dev",
        "deployment": "test-app",
        "container": "app",
        "currentCPU": "2000m",
        "recommendedCPU": "500m",
        "monthlySavings": 87.45
      }
    ]
  },
  "abandonedWorkloads": [
    {
      "namespace": "staging",
      "deployment": "old-api",
      "monthlyCost": 234.56,
      "reason": "0 requests last 30 days"
    }
  ]
}
Kubecost Alerts
# kubecost-alerts.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubecost-alerts
  namespace: kubecost
data:
  alerts.json: |
    {
      "alerts": [
        {
          "type": "budget",
          "name": "Production Budget Alert",
          "threshold": 10000,
          "window": "monthly",
          "aggregation": "namespace",
          "filter": "namespace=production",
          "ownerContact": ["team-platform@company.com"]
        },
        {
          "type": "spendChange",
          "name": "Staging Spend Spike",
          "threshold": 50,
          "window": "1d",
          "aggregation": "namespace",
          "filter": "namespace=staging",
          "ownerContact": ["team-dev@company.com"]
        },
        {
          "type": "efficiency",
          "name": "Low Efficiency Alert",
          "threshold": 0.5,
          "window": "7d",
          "aggregation": "deployment",
          "ownerContact": ["finops@company.com"]
        }
      ]
    }
Reserved Instances and Savings Plans
Analyzing RI coverage
#!/usr/bin/env python3
# ri_coverage.py
import boto3
from datetime import datetime, timedelta

def analyze_ri_coverage():
    ce = boto3.client('ce')  # Cost Explorer
    end = datetime.now().date()
    start = end - timedelta(days=30)
    response = ce.get_reservation_coverage(
        TimePeriod={
            'Start': start.strftime('%Y-%m-%d'),
            'End': end.strftime('%Y-%m-%d')
        },
        Granularity='MONTHLY',
        GroupBy=[
            {'Type': 'DIMENSION', 'Key': 'INSTANCE_TYPE'},
            {'Type': 'DIMENSION', 'Key': 'REGION'}
        ]
    )
    print("Reserved Instance Coverage Report")
    print("=" * 80)
    for item in response['CoveragesByTime']:
        period = item['TimePeriod']
        for group in item['Groups']:
            instance_type = group['Attributes'].get('INSTANCE_TYPE', 'N/A')
            region = group['Attributes'].get('REGION', 'N/A')
            coverage = group['Coverage']
            coverage_hours = coverage['CoverageHours']
            on_demand_hours = float(coverage_hours.get('OnDemandHours', 0))
            reserved_hours = float(coverage_hours.get('ReservedHours', 0))
            total_hours = float(coverage_hours.get('TotalRunningHours', 0))
            if total_hours > 0:
                coverage_pct = (reserved_hours / total_hours) * 100
                print(f"\n{instance_type} in {region}")
                print(f"  Total Hours: {total_hours:.0f}")
                print(f"  Reserved Hours: {reserved_hours:.0f}")
                print(f"  On-Demand Hours: {on_demand_hours:.0f}")
                print(f"  Coverage: {coverage_pct:.1f}%")
                # Recommend a purchase when coverage < 70%
                if coverage_pct < 70 and total_hours > 500:
                    print(f"  ⚠️ RECOMMENDATION: Consider purchasing RI")

def get_ri_recommendations():
    ce = boto3.client('ce')
    response = ce.get_reservation_purchase_recommendation(
        Service='Amazon Elastic Compute Cloud - Compute',
        AccountScope='PAYER',
        LookbackPeriodInDays='THIRTY_DAYS',
        TermInYears='ONE_YEAR',
        PaymentOption='NO_UPFRONT'
    )
    print("\n" + "=" * 80)
    print("RI Purchase Recommendations")
    print("=" * 80)
    for rec in response['Recommendations']:
        # RecommendationDetails is a list of per-instance-type recommendations
        for details in rec['RecommendationDetails']:
            instance = details.get('InstanceDetails', {}).get('EC2InstanceDetails', {})
            print(f"\nInstance Type: {instance.get('InstanceType', 'N/A')}")
            print(f"Region: {instance.get('Region', 'N/A')}")
            print(f"Recommended: {details.get('RecommendedNumberOfInstancesToPurchase', 0)} instances")
            print(f"Monthly Savings: ${float(details.get('EstimatedMonthlySavingsAmount', 0)):.2f}")
            print(f"Upfront Cost: ${float(details.get('UpfrontCost', 0)):.2f}")
            print(f"Monthly Cost: ${float(details.get('RecurringStandardMonthlyCost', 0)):.2f}")

if __name__ == '__main__':
    analyze_ri_coverage()
    get_ri_recommendations()
Savings Plans vs Reserved Instances
| Criterion | Reserved Instances | Savings Plans |
| Flexibility | Fixed (type/region) | Flexible (type/region/family) |
| Discount | 40-72% | 40-66% |
| Commitment | Specific instance | $/hour of compute |
| Scope | EC2 only | EC2, Lambda, Fargate |
| Changes | Modify/exchange | Automatic |
| Best for | Stable workloads | Variable workloads |
2026 recommendation: Savings Plans for 60-70% of the base load, Spot for flexible workloads.
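A back-of-the-envelope model of this recommendation: cover part of the base load with Savings Plans, run flexible work on Spot, and pay on-demand for the rest. The default shares and discount rates below are illustrative midpoints, not quotes.

```python
# Blended cost of a Savings Plans + Spot + on-demand portfolio (illustrative rates).
def blended_monthly_cost(on_demand_monthly: float,
                         sp_share: float = 0.65,    # 60-70% base load on Savings Plans
                         sp_discount: float = 0.50,
                         spot_share: float = 0.20,
                         spot_discount: float = 0.80) -> float:
    od_share = 1.0 - sp_share - spot_share  # remainder stays on-demand
    return on_demand_monthly * (
        sp_share * (1 - sp_discount)
        + spot_share * (1 - spot_discount)
        + od_share
    )

# $100k/month on-demand equivalent
print(f"${blended_monthly_cost(100_000):.0f}")
```

With these assumed rates the blended bill lands at roughly half the all-on-demand cost, which is consistent with the 30-50% savings range cited elsewhere in this guide.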
Spot instances and resilient architecture
Spot instances: 70-90% discount
Use cases:
- CI/CD runners
- Batch processing
- Data analytics
- Dev/test environments
- Stateless applications
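For batch workloads, the headline discount is eroded by interruptions: every reclaim re-runs part of the job. A rough model of that trade-off (the interruption rate and rework fraction are hypothetical inputs you would measure for your own jobs):

```python
# Effective Spot savings for batch jobs, net of re-run overhead (simplified model).
def effective_spot_discount(headline_discount: float,
                            interruption_rate: float,
                            rework_fraction: float) -> float:
    """interruption_rate: interruptions per job; rework_fraction: share of a job redone per interruption."""
    overhead = interruption_rate * rework_fraction
    cost_ratio = (1 - headline_discount) * (1 + overhead)
    return 1 - cost_ratio

# 80% headline discount, 0.3 interruptions per job, 25% of the job redone each time
print(f"{effective_spot_discount(0.80, 0.3, 0.25):.1%}")
```

Checkpointing (lowering the rework fraction) keeps the effective discount close to the headline number, which is why the spot-friendly Deployment below handles SIGTERM gracefully.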
Kubernetes with Spot instances
# spot-nodegroup.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: production
  region: eu-west-1
nodeGroups:
  # On-Demand for critical workloads
  - name: on-demand
    instanceType: m5.xlarge
    minSize: 3
    maxSize: 10
    desiredCapacity: 5
    labels:
      workload-type: critical
    taints:
      - key: workload-type
        value: critical
        effect: NoSchedule
  # Spot for interruption-tolerant workloads
  - name: spot
    instancesDistribution:
      instanceTypes:
        - m5.xlarge
        - m5a.xlarge
        - m5n.xlarge
      onDemandBaseCapacity: 0
      onDemandPercentageAboveBaseCapacity: 0
      spotInstancePools: 3
    minSize: 0
    maxSize: 50
    desiredCapacity: 10
    labels:
      workload-type: flexible
    taints:
      - key: workload-type
        value: flexible
        effect: NoSchedule
Spot-friendly Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-processor
spec:
  replicas: 10
  selector:
    matchLabels:
      app: batch-processor
  template:
    metadata:
      labels:
        app: batch-processor
    spec:
      # Tolerate Spot instances
      tolerations:
        - key: workload-type
          operator: Equal
          value: flexible
          effect: NoSchedule
      # Prefer Spot nodes
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              preference:
                matchExpressions:
                  - key: workload-type
                    operator: In
                    values:
                      - flexible
      # Graceful shutdown
      terminationGracePeriodSeconds: 120
      containers:
        - name: processor
          image: batch-processor:v1
          # Handle SIGTERM properly
          lifecycle:
            preStop:
              exec:
                command: ['/bin/sh', '-c', 'sleep 15']
          resources:
            requests:
              cpu: 1000m
              memory: 2Gi
            limits:
              cpu: 2000m
              memory: 4Gi
Spot interruption handler
# Install the AWS Node Termination Handler
helm repo add eks https://aws.github.io/eks-charts
helm install aws-node-termination-handler \
  eks/aws-node-termination-handler \
  --namespace kube-system \
  --set enableSpotInterruptionDraining=true \
  --set enableScheduledEventDraining=true
Custom handler:
#!/usr/bin/env python3
# spot_handler.py - runs on each spot node
import os
import subprocess
import time

import requests

METADATA_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"
NODE_NAME = os.environ["NODE_NAME"]  # inject via the downward API

def check_spot_termination():
    try:
        response = requests.get(METADATA_URL, timeout=1)
        if response.status_code == 200:
            return True, response.json()
    except requests.RequestException:
        pass
    return False, None

def drain_node():
    # Cordon the node so no new pods are scheduled on it
    subprocess.run(['kubectl', 'cordon', NODE_NAME])
    # Drain with a grace period
    subprocess.run([
        'kubectl', 'drain', NODE_NAME,
        '--ignore-daemonsets',
        '--delete-emptydir-data',
        '--grace-period=90'
    ])

if __name__ == '__main__':
    while True:
        terminating, action = check_spot_termination()
        if terminating:
            print(f"Spot termination notice received: {action}")
            drain_node()
            break
        time.sleep(5)
Real-time dashboards and alerts
CloudWatch Billing Dashboard
#!/usr/bin/env python3
# create_billing_dashboard.py
import boto3
import json

# Billing metrics are only published in us-east-1
# and require the Currency=USD dimension
cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')

dashboard_body = {
    "widgets": [
        {
            "type": "metric",
            "properties": {
                "metrics": [
                    ["AWS/Billing", "EstimatedCharges", "Currency", "USD", {"stat": "Maximum"}]
                ],
                "period": 21600,
                "stat": "Maximum",
                "region": "us-east-1",
                "title": "Total AWS Charges (MTD)",
                "yAxis": {
                    "left": {
                        "label": "USD"
                    }
                }
            }
        },
        {
            "type": "metric",
            "properties": {
                "metrics": [
                    ["AWS/Billing", "EstimatedCharges", "ServiceName", "AmazonEC2", "Currency", "USD"],
                    ["AWS/Billing", "EstimatedCharges", "ServiceName", "AmazonRDS", "Currency", "USD"],
                    ["AWS/Billing", "EstimatedCharges", "ServiceName", "AmazonS3", "Currency", "USD"],
                    ["AWS/Billing", "EstimatedCharges", "ServiceName", "AmazonEKS", "Currency", "USD"]
                ],
                "period": 21600,
                "stat": "Maximum",
                "region": "us-east-1",
                "title": "Charges by Service",
                "yAxis": {
                    "left": {
                        "label": "USD"
                    }
                }
            }
        }
    ]
}

cloudwatch.put_dashboard(
    DashboardName='FinOps-Billing',
    DashboardBody=json.dumps(dashboard_body)
)
print("Dashboard created: FinOps-Billing")
Budget Alerts
# AWS Budget with alerts
aws budgets create-budget \
  --account-id 123456789012 \
  --budget file://budget.json \
  --notifications-with-subscribers file://notifications.json
// budget.json
{
  "BudgetName": "Monthly-Production-Budget",
  "BudgetLimit": {
    "Amount": "10000",
    "Unit": "USD"
  },
  "TimeUnit": "MONTHLY",
  "BudgetType": "COST",
  "CostFilters": {
    "TagKeyValue": ["user:Environment$production"]
  }
}
// notifications.json
[
  {
    "Notification": {
      "NotificationType": "ACTUAL",
      "ComparisonOperator": "GREATER_THAN",
      "Threshold": 80,
      "ThresholdType": "PERCENTAGE"
    },
    "Subscribers": [
      {
        "SubscriptionType": "EMAIL",
        "Address": "finops@company.com"
      },
      {
        "SubscriptionType": "SNS",
        "Address": "arn:aws:sns:eu-west-1:123456789012:budget-alerts"
      }
    ]
  },
  {
    "Notification": {
      "NotificationType": "FORECASTED",
      "ComparisonOperator": "GREATER_THAN",
      "Threshold": 100,
      "ThresholdType": "PERCENTAGE"
    },
    "Subscribers": [
      {
        "SubscriptionType": "EMAIL",
        "Address": "cto@company.com"
      }
    ]
  }
]
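The notification logic above (actual spend past 80%, forecasted spend past 100%) can be mirrored in a few lines, for example to replay it in tests or in a custom Slack bot. A minimal sketch; thresholds and recipients are taken from the JSON above:

```python
# Replays the budget notification rules: ACTUAL > 80% and FORECASTED > 100%.
def triggered_alerts(budget_limit: float, actual: float, forecasted: float) -> list:
    alerts = []
    if actual > budget_limit * 0.80:
        alerts.append("ACTUAL>80% -> finops@company.com")
    if forecasted > budget_limit * 1.00:
        alerts.append("FORECASTED>100% -> cto@company.com")
    return alerts

print(triggered_alerts(10_000, actual=8_500, forecasted=10_400))
```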
FinOps culture and governance
Chargeback vs Showback
Showback: cost transparency, no internal billing
# showback_report.py - weekly email to each team
def generate_showback_report(team):
    costs = get_team_costs(team, days=7)
    report = f"""
FinOps Weekly Report - {team}

Last 7 days costs: ${costs['total']:.2f}

Breakdown:
- Compute (EC2/EKS): ${costs['compute']:.2f}
- Storage (EBS/S3): ${costs['storage']:.2f}
- Database (RDS): ${costs['database']:.2f}
- Networking: ${costs['network']:.2f}

Trend: {costs['trend']}% vs last week

Top 5 resources:
{format_top_resources(costs['top_resources'])}

Optimization opportunities:
- {len(costs['recommendations'])} rightsizing recommendations
- Potential monthly savings: ${costs['potential_savings']:.2f}

View detailed breakdown: https://finops.company.com/teams/{team}
"""
    send_email(f"{team}@company.com", "Weekly FinOps Report", report)
Chargeback: teams are actually billed for their usage
# chargeback_invoice.py - monthly
def generate_chargeback_invoice(team, month):
    costs = get_team_costs(team, month=month)
    # Apply a markup (infra overhead)
    markup = 1.15  # 15% overhead
    total_with_markup = costs['total'] * markup
    invoice = {
        'team': team,
        'period': month,
        'subtotal': costs['total'],
        'markup': costs['total'] * 0.15,
        'total': total_with_markup,
        'cost_center': get_cost_center(team)
    }
    # Export to the ERP
    export_to_erp(invoice)
    return invoice
FinOps KPIs
# finops_kpis.py - executive dashboard
def calculate_finops_kpis():
    return {
        # Unit costs
        'cost_per_customer': total_costs / total_customers,
        'cost_per_transaction': total_costs / total_transactions,
        'cost_per_api_call': total_costs / total_api_calls,
        # Efficiency
        'compute_utilization': used_compute / provisioned_compute,
        'storage_utilization': used_storage / provisioned_storage,
        'waste_percentage': wasted_spend / total_spend,
        # Coverage
        'ri_coverage': reserved_hours / total_hours,
        'spot_usage': spot_hours / total_hours,
        # Governance
        'tagged_resources': tagged / total_resources,
        'budget_adherence': actual_spend / budgeted_spend
    }
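A worked version of a subset of these KPIs with explicit inputs, so the ratios are concrete. All figures are hypothetical sample data, not benchmarks.

```python
# Self-contained variant of the KPI sketch above (subset, explicit inputs).
def finops_kpis(total_costs, customers, used_compute, provisioned_compute,
                reserved_hours, total_hours, tagged, total_resources):
    return {
        "cost_per_customer": total_costs / customers,
        "compute_utilization": used_compute / provisioned_compute,
        "ri_coverage": reserved_hours / total_hours,
        "tagged_resources": tagged / total_resources,
    }

# Hypothetical month: $120k spend, 4k customers, 640/1000 vCPU used,
# 5200/8000 instance-hours reserved, 940/1000 resources tagged
kpis = finops_kpis(120_000, 4_000, 640, 1_000, 5_200, 8_000, 940, 1_000)
for name, value in kpis.items():
    print(f"{name}: {value:.2f}")
```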
FinOps checklist
✅ Phase 1: Visibility (Month 1)
- Tagging strategy defined and enforced
- Tagging compliance ≥90%
- Cost Explorer configured
- Billing dashboards created
- Data exported to a data lake
✅ Phase 2: Analysis (Month 2)
- EC2/RDS utilization analysis
- Rightsizing recommendations
- Storage optimization (EBS/S3)
- Idle resources identified
- Quick wins implemented (20-30% savings)
✅ Phase 3: Optimization (Months 3-4)
- Kubecost deployed (if on K8s)
- RI/Savings Plans purchased (60-70% of base load)
- Spot instance architecture in place
- Budgets and alerts active
- Weekly showback reports
✅ Phase 4: Governance (Months 5-6)
- Automated policies (tag enforcement)
- Chargeback implemented
- Monthly FinOps reviews
- KPIs tracked and reported
- FinOps culture established
Conclusion
FinOps is essential in 2026 as cloud budgets keep growing. Tagging, rightsizing, Reserved Instances, and Kubecost can save 30-50% while preserving performance and agility.
Key takeaways:
- Tagging is the foundation of visibility
- Rightsizing delivers 20-30% quick wins
- Kubecost is essential for Kubernetes FinOps
- RI/Savings Plans: 40-70% discount on the base load
- Spot: 70-90% discount for flexible workloads
Typical gains:
- Savings: 30-50% of the cloud budget
- ROI: 2-6 months
- Visibility: 100% of resources tagged
- Efficiency: +40% compute utilization
- Waste: -80% idle resources
Priority actions:
- Make tagging mandatory
- Run an EC2/RDS rightsizing analysis
- Deploy Kubecost (if on K8s)
- Buy RI/Savings Plans for the base load
- Architect for Spot instances


