FinOps becomes critical in 2026: cloud budgets are exploding (+40% per year) and average waste runs at 30%. This guide covers automated tagging, rightsizing, Kubecost for Kubernetes, and continuous real-time optimization.
Contents
- What is FinOps?
- Tagging and cost allocation
- Rightsizing instances and storage
- Kubecost: FinOps for Kubernetes
- Reserved Instances and Savings Plans
- Spot instances and resilient architecture
- Real-time dashboards and alerts
- FinOps culture and governance
- Conclusion
What is FinOps?
Definition and 2026 context
FinOps = a cultural practice and discipline that brings finance, technology, and business together to optimize cloud spending.
The problem:
- Cloud spend: +40% annual growth
- Average waste: 30% of the cloud budget
- Visibility: fewer than 50% of companies know their actual costs
- Attribution: costs cannot be charged back to teams accurately
FinOps goals:
- Visibility: real-time costs per team/project
- Accountability: each team owns its costs
- Optimization: ROI-driven decisions
- Predictability: accurate budgets and forecasts
2026 statistics
- 82% of companies have formally adopted FinOps
- $1.3T in global cloud spend
- 30% average savings after adopting FinOps
- 2-6 months typical ROI for a FinOps initiative
- FinOps Engineer is among the top 10 most in-demand cloud roles
The FinOps Foundation model
┌────────────────────────────────────────┐
│ INFORM (Visibility)                    │
│ • Cost allocation                      │
│ • Resource tagging                     │
│ • Forecasting                          │
│ • Benchmarking                         │
└────────────────────────────────────────┘
                    ↓
┌────────────────────────────────────────┐
│ OPTIMIZE (Efficiency)                  │
│ • Rightsizing                          │
│ • Reserved Instances                   │
│ • Spot instances                       │
│ • Storage optimization                 │
└────────────────────────────────────────┘
                    ↓
┌────────────────────────────────────────┐
│ OPERATE (Governance)                   │
│ • Automated policies                   │
│ • Budget alerts                        │
│ • Chargeback/Showback                  │
│ • FinOps culture                       │
└────────────────────────────────────────┘
Tagging and cost allocation
Tagging strategy
Essential tags:
- Environment: prod/staging/dev
- Team: owning team
- Project: project/product
- CostCenter: finance cost center
- Owner: responsible owner's email
- Application: application name
- ManagedBy: terraform/manual/autoscaling
AWS tagging policy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": [
        "ec2:RunInstances",
        "rds:CreateDBInstance",
        "s3:CreateBucket",
        "elasticloadbalancing:CreateLoadBalancer"
      ],
      "Resource": "*",
      "Condition": {
        "StringNotLike": {
          "aws:RequestTag/Environment": ["prod", "staging", "dev"],
          "aws:RequestTag/Team": "*",
          "aws:RequestTag/Project": "*",
          "aws:RequestTag/CostCenter": "*"
        }
      }
    }
  ]
}
Apply it via AWS Organizations:
# Service Control Policy (SCP)
aws organizations create-policy \
  --name RequireTagsPolicy \
  --type SERVICE_CONTROL_POLICY \
  --content file://require-tags-policy.json
# Attach it to an OU
aws organizations attach-policy \
  --policy-id p-abc123 \
  --target-id ou-xyz789
Automatic tagging with Terraform
# variables.tf
variable "default_tags" {
  type = map(string)
  default = {
    Environment = "prod"
    ManagedBy   = "terraform"
    Team        = "platform"
    CostCenter  = "engineering"
  }
}
# provider.tf
provider "aws" {
  region = "eu-west-1"
  default_tags {
    tags = var.default_tags
  }
}
# main.tf - the provider's default_tags are applied automatically,
# so only resource-specific tags need to be declared here
resource "aws_instance" "app" {
  ami           = "ami-12345678"
  instance_type = "t3.medium"
  tags = {
    Name        = "app-server"
    Application = "payment-api"
    Owner       = "team-payments@company.com"
  }
}
Tag Compliance Checker
#!/usr/bin/env python3
# check_tags.py
import boto3
import json
from datetime import datetime

REQUIRED_TAGS = ['Environment', 'Team', 'Project', 'CostCenter']

def check_ec2_tags():
    ec2 = boto3.client('ec2')
    instances = ec2.describe_instances()
    non_compliant = []
    for reservation in instances['Reservations']:
        for instance in reservation['Instances']:
            instance_id = instance['InstanceId']
            tags = {tag['Key']: tag['Value'] for tag in instance.get('Tags', [])}
            missing_tags = [tag for tag in REQUIRED_TAGS if tag not in tags]
            if missing_tags:
                non_compliant.append({
                    'InstanceId': instance_id,
                    'MissingTags': missing_tags,
                    'State': instance['State']['Name']
                })
    return non_compliant

def check_rds_tags():
    rds = boto3.client('rds')
    instances = rds.describe_db_instances()
    non_compliant = []
    for instance in instances['DBInstances']:
        db_id = instance['DBInstanceIdentifier']
        arn = instance['DBInstanceArn']
        tags_response = rds.list_tags_for_resource(ResourceName=arn)
        tags = {tag['Key']: tag['Value'] for tag in tags_response['TagList']}
        missing_tags = [tag for tag in REQUIRED_TAGS if tag not in tags]
        if missing_tags:
            non_compliant.append({
                'DBInstanceId': db_id,
                'MissingTags': missing_tags,
                'Status': instance['DBInstanceStatus']
            })
    return non_compliant

def main():
    print("Checking tag compliance...")
    ec2_issues = check_ec2_tags()
    rds_issues = check_rds_tags()
    report = {
        'Timestamp': datetime.now().isoformat(),
        'EC2': {
            'NonCompliantCount': len(ec2_issues),
            'NonCompliant': ec2_issues
        },
        'RDS': {
            'NonCompliantCount': len(rds_issues),
            'NonCompliant': rds_issues
        }
    }
    print(json.dumps(report, indent=2))
    # Slack notification if issues were found
    if ec2_issues or rds_issues:
        # send_slack_alert(report)
        pass

if __name__ == '__main__':
    main()
# Daily cron job
0 9 * * * /usr/local/bin/check_tags.py | mail -s "Tag Compliance Report" finops@company.com
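The script above covers EC2 and RDS; for broader coverage, the Resource Groups Tagging API can scan most taggable services in a single call. A minimal sketch (note that GetResources only sees resources known to the tagging API, so keep the per-service checks as a complement):
#!/usr/bin/env python3
# untagged_scan.py - sketch using the Resource Groups Tagging API
import boto3

REQUIRED_TAGS = ['Environment', 'Team', 'Project', 'CostCenter']

def scan_untagged():
    client = boto3.client('resourcegroupstaggingapi')
    paginator = client.get_paginator('get_resources')
    non_compliant = []
    for page in paginator.paginate():
        for resource in page['ResourceTagMappingList']:
            tags = {t['Key'] for t in resource.get('Tags', [])}
            missing = [t for t in REQUIRED_TAGS if t not in tags]
            if missing:
                non_compliant.append((resource['ResourceARN'], missing))
    return non_compliant

if __name__ == '__main__':
    issues = scan_untagged()
    for arn, missing in issues:
        print(f"{arn}: missing {missing}")
    print(f"{len(issues)} non-compliant resources")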
Rightsizing instances and storage
Analyzing EC2 utilization
# CloudWatch metrics over 14 days
aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-abc123 \
  --start-time 2026-01-03T00:00:00Z \
  --end-time 2026-01-17T00:00:00Z \
  --period 3600 \
  --statistics Average,Maximum
# Example output:
# Average: 12%
# Maximum: 28%
# → Instance oversized; rightsizing recommended
Automated rightsizing script
#!/usr/bin/env python3
# rightsizing_recommendations.py
import boto3
from datetime import datetime, timedelta

def get_cpu_utilization(instance_id, days=14):
    cloudwatch = boto3.client('cloudwatch')
    end_time = datetime.utcnow()
    start_time = end_time - timedelta(days=days)
    response = cloudwatch.get_metric_statistics(
        Namespace='AWS/EC2',
        MetricName='CPUUtilization',
        Dimensions=[{'Name': 'InstanceId', 'Value': instance_id}],
        StartTime=start_time,
        EndTime=end_time,
        Period=3600,
        Statistics=['Average', 'Maximum']
    )
    if not response['Datapoints']:
        return None, None
    avg = sum(d['Average'] for d in response['Datapoints']) / len(response['Datapoints'])
    max_cpu = max(d['Maximum'] for d in response['Datapoints'])
    return avg, max_cpu

def get_rightsizing_recommendation(instance_type, avg_cpu, max_cpu):
    """
    Utilization-based recommendations:
    - avg < 20% and max < 40%: downsize
    - avg > 70% or max > 90%: upsize
    """
    # Instance type mapping (simplified; m5.large is the smallest m5 size)
    downsize_map = {
        't3.xlarge': 't3.large',
        't3.large': 't3.medium',
        't3.medium': 't3.small',
        'm5.2xlarge': 'm5.xlarge',
        'm5.xlarge': 'm5.large'
    }
    upsize_map = {v: k for k, v in downsize_map.items()}
    if avg_cpu < 20 and max_cpu < 40:
        return downsize_map.get(instance_type, instance_type), "downsize"
    elif avg_cpu > 70 or max_cpu > 90:
        return upsize_map.get(instance_type, instance_type), "upsize"
    return instance_type, "optimal"

def analyze_instances():
    ec2 = boto3.client('ec2')
    instances = ec2.describe_instances(
        Filters=[{'Name': 'instance-state-name', 'Values': ['running']}]
    )
    recommendations = []
    for reservation in instances['Reservations']:
        for instance in reservation['Instances']:
            instance_id = instance['InstanceId']
            instance_type = instance['InstanceType']
            tags = {tag['Key']: tag['Value'] for tag in instance.get('Tags', [])}
            name = tags.get('Name', 'N/A')
            avg_cpu, max_cpu = get_cpu_utilization(instance_id)
            if avg_cpu is None:
                continue
            recommended_type, action = get_rightsizing_recommendation(
                instance_type, avg_cpu, max_cpu
            )
            if action != "optimal":
                # Estimate the savings
                current_cost = get_instance_cost(instance_type)
                new_cost = get_instance_cost(recommended_type)
                monthly_savings = (current_cost - new_cost) * 730  # hours/month
                recommendations.append({
                    'InstanceId': instance_id,
                    'Name': name,
                    'CurrentType': instance_type,
                    'AvgCPU': f"{avg_cpu:.1f}%",
                    'MaxCPU': f"{max_cpu:.1f}%",
                    'Recommendation': recommended_type,
                    'Action': action,
                    'MonthlySavings': f"${monthly_savings:.2f}"
                })
    return recommendations

def get_instance_cost(instance_type):
    """On-demand hourly prices (simplified - use the AWS Price List API)"""
    prices = {
        't3.small': 0.0208,
        't3.medium': 0.0416,
        't3.large': 0.0832,
        't3.xlarge': 0.1664,
        'm5.large': 0.192,
        'm5.xlarge': 0.384,
        'm5.2xlarge': 0.768
    }
    return prices.get(instance_type, 0)

def main():
    print("Analyzing EC2 instances for rightsizing...")
    recommendations = analyze_instances()
    print(f"\nFound {len(recommendations)} rightsizing opportunities:")
    print("-" * 100)
    for rec in recommendations:
        print(f"Instance: {rec['InstanceId']} ({rec['Name']})")
        print(f"  Current: {rec['CurrentType']} - CPU: {rec['AvgCPU']} avg, {rec['MaxCPU']} max")
        print(f"  Recommendation: {rec['Action'].upper()} to {rec['Recommendation']}")
        print(f"  Monthly savings: {rec['MonthlySavings']}")
        print()
    total_savings = sum(float(r['MonthlySavings'].replace('$', '')) for r in recommendations)
    print(f"Total potential monthly savings: ${total_savings:.2f}")
    print(f"Annual savings: ${total_savings * 12:.2f}")

if __name__ == '__main__':
    main()
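Note that AWS Compute Optimizer performs this same analysis natively from 14 days of CloudWatch metrics. A minimal sketch pulling its findings, assuming Compute Optimizer is enabled on the account:
#!/usr/bin/env python3
# compute_optimizer_recs.py - sketch, assumes Compute Optimizer is enabled
import boto3

def list_ec2_recommendations():
    co = boto3.client('compute-optimizer')
    response = co.get_ec2_instance_recommendations()
    for rec in response['instanceRecommendations']:
        current = rec['currentInstanceType']
        finding = rec['finding']  # e.g. Overprovisioned / Underprovisioned / Optimized
        options = rec.get('recommendationOptions', [])
        best = options[0]['instanceType'] if options else current
        print(f"{rec['instanceArn']}: {finding} - {current} -> {best}")

if __name__ == '__main__':
    list_ec2_recommendations()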
Storage optimization
Unattached EBS volumes:
# List available (unattached) volumes
aws ec2 describe-volumes \
  --filters Name=status,Values=available \
  --query 'Volumes[*].[VolumeId,Size,VolumeType,CreateTime]' \
  --output table
# Cost reference:
# gp3: $0.08/GB/month
# io2: $0.125/GB/month
# Snapshot, then delete unused volumes
for vol in $(aws ec2 describe-volumes --filters Name=status,Values=available --query 'Volumes[*].VolumeId' --output text); do
  aws ec2 create-snapshot --volume-id $vol --description "Backup before deletion"
  aws ec2 delete-volume --volume-id $vol
done
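Before deleting anything, it can help to quantify the waste. A quick sketch that prices unattached volumes using the per-GB rates quoted above (simplified rates; check current regional pricing):
#!/usr/bin/env python3
# ebs_waste.py - sketch; rates are the simplified figures quoted above
import boto3

RATES = {'gp3': 0.08, 'io2': 0.125, 'gp2': 0.10}  # $/GB/month, indicative

def unattached_volume_waste():
    ec2 = boto3.client('ec2')
    volumes = ec2.describe_volumes(
        Filters=[{'Name': 'status', 'Values': ['available']}]
    )['Volumes']
    total = 0.0
    for vol in volumes:
        cost = vol['Size'] * RATES.get(vol['VolumeType'], 0.10)
        total += cost
        print(f"{vol['VolumeId']}: {vol['Size']} GB {vol['VolumeType']} ≈ ${cost:.2f}/month")
    print(f"Total monthly waste: ${total:.2f}")

if __name__ == '__main__':
    unattached_volume_waste()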
S3 lifecycle policies:
{
  "Rules": [
    {
      "Id": "MoveToIA",
      "Status": "Enabled",
      "Filter": {},
      "Transitions": [
        {
          "Days": 90,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 180,
          "StorageClass": "GLACIER"
        },
        {
          "Days": 365,
          "StorageClass": "DEEP_ARCHIVE"
        }
      ],
      "NoncurrentVersionTransitions": [
        {
          "NoncurrentDays": 30,
          "StorageClass": "STANDARD_IA"
        }
      ],
      "NoncurrentVersionExpiration": {
        "NoncurrentDays": 90
      }
    },
    {
      "Id": "DeleteOldBackups",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "backups/"
      },
      "Expiration": {
        "Days": 730
      }
    }
  ]
}
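One way to apply this policy from a script rather than the console; a minimal sketch where the bucket name and the lifecycle.json path are placeholders:
#!/usr/bin/env python3
# apply_lifecycle.py - sketch; 'my-bucket' and lifecycle.json are placeholders
import boto3
import json

s3 = boto3.client('s3')
with open('lifecycle.json') as f:  # the policy above, saved locally
    lifecycle = json.load(f)
s3.put_bucket_lifecycle_configuration(
    Bucket='my-bucket',
    LifecycleConfiguration=lifecycle
)
print("Lifecycle configuration applied")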
Kubecost: FinOps for Kubernetes
Installing Kubecost
# Add the Helm repo
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm repo update
# Install Kubecost
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost --create-namespace \
  --set kubecostToken="aGVsbEB3b3JsZAo=" \
  --set prometheus.server.persistentVolume.enabled=true \
  --set prometheus.server.persistentVolume.size=32Gi
# Verify
kubectl get pods -n kubecost
# kubecost-cost-analyzer-xxx 3/3 Running
# kubecost-prometheus-server-xxx 2/2 Running
# Port-forward the UI
kubectl port-forward -n kubecost svc/kubecost-cost-analyzer 9090:9090
# Access: http://localhost:9090
Cloud billing configuration
AWS:
# kubecost-values.yaml
kubecostProductConfigs:
  cloudIntegrationSecret: cloud-integration
  awsSpotDataRegion: eu-west-1
  awsSpotDataBucket: kubecost-spot-data-bucket
  athenaProjectID: my-project
  athenaBucketName: aws-athena-query-results-bucket
  athenaRegion: eu-west-1
  athenaDatabase: athenacurcfn_cur
  athenaTable: cur
GCP:
kubecostProductConfigs:
  cloudIntegrationSecret: cloud-integration
  gcpBillingDataDataset: billing_export
  gcpProjectID: my-gcp-project
# Secret for the credentials
kubectl create secret generic cloud-integration \
  -n kubecost \
  --from-file=cloud-integration.json=gcp-key.json
# Upgrade with the config
helm upgrade kubecost kubecost/cost-analyzer \
  -n kubecost \
  -f kubecost-values.yaml
Allocation by namespace/label
# Kubecost API - costs per namespace
curl "http://localhost:9090/model/allocation?window=7d&aggregate=namespace"
# JSON output:
{
  "data": [
    {
      "namespace": "production",
      "totalCost": 12456.78,
      "cpuCost": 5432.10,
      "ramCost": 4321.09,
      "pvCost": 2703.59
    },
    {
      "namespace": "staging",
      "totalCost": 1234.56,
      ...
    }
  ]
}
Allocation by label:
# Costs per team
curl "http://localhost:9090/model/allocation?window=30d&aggregate=label:team"
# Costs per application
curl "http://localhost:9090/model/allocation?window=30d&aggregate=label:app"
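The same endpoint is easy to consume from scripts, for example to feed the showback reports described later. A minimal sketch against the port-forward above (the exact response shape varies across Kubecost versions, so treat the parsing as indicative):
#!/usr/bin/env python3
# kubecost_team_costs.py - sketch; assumes the port-forward above is running
import requests

KUBECOST = "http://localhost:9090"

def team_costs(window="7d"):
    resp = requests.get(
        f"{KUBECOST}/model/allocation",
        params={"window": window, "aggregate": "label:team"},
        timeout=30,
    )
    resp.raise_for_status()
    # "data" is a list of allocation sets; each maps team -> cost details
    for allocation_set in resp.json()["data"]:
        for team, alloc in allocation_set.items():
            print(f"{team}: ${alloc['totalCost']:.2f}")

if __name__ == '__main__':
    team_costs()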
Savings recommendations
# Recommendations API
curl "http://localhost:9090/model/savings"
# Output:
{
  "clusterSizing": {
    "overprovisioned": [
      {
        "namespace": "dev",
        "deployment": "test-app",
        "container": "app",
        "currentCPU": "2000m",
        "recommendedCPU": "500m",
        "monthlySavings": 87.45
      }
    ]
  },
  "abandonedWorkloads": [
    {
      "namespace": "staging",
      "deployment": "old-api",
      "monthlyCost": 234.56,
      "reason": "0 requests last 30 days"
    }
  ]
}
Kubecost Alerts
# kubecost-alerts.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubecost-alerts
  namespace: kubecost
data:
  alerts.json: |
    {
      "alerts": [
        {
          "type": "budget",
          "name": "Production Budget Alert",
          "threshold": 10000,
          "window": "monthly",
          "aggregation": "namespace",
          "filter": "namespace=production",
          "ownerContact": ["team-platform@company.com"]
        },
        {
          "type": "spendChange",
          "name": "Staging Spend Spike",
          "threshold": 50,
          "window": "1d",
          "aggregation": "namespace",
          "filter": "namespace=staging",
          "ownerContact": ["team-dev@company.com"]
        },
        {
          "type": "efficiency",
          "name": "Low Efficiency Alert",
          "threshold": 0.5,
          "window": "7d",
          "aggregation": "deployment",
          "ownerContact": ["finops@company.com"]
        }
      ]
    }
Reserved Instances and Savings Plans
Analyzing RI coverage
#!/usr/bin/env python3
# ri_coverage.py
import boto3
from datetime import datetime, timedelta

def analyze_ri_coverage():
    ce = boto3.client('ce')  # Cost Explorer
    end = datetime.now().date()
    start = end - timedelta(days=30)
    response = ce.get_reservation_coverage(
        TimePeriod={
            'Start': start.strftime('%Y-%m-%d'),
            'End': end.strftime('%Y-%m-%d')
        },
        Granularity='MONTHLY',
        GroupBy=[
            {'Type': 'DIMENSION', 'Key': 'INSTANCE_TYPE'},
            {'Type': 'DIMENSION', 'Key': 'REGION'}
        ]
    )
    print("Reserved Instance Coverage Report")
    print("=" * 80)
    for item in response['CoveragesByTime']:
        for group in item['Groups']:
            instance_type = group['Attributes'].get('INSTANCE_TYPE', 'N/A')
            region = group['Attributes'].get('REGION', 'N/A')
            coverage_hours = group['Coverage']['CoverageHours']
            on_demand_hours = float(coverage_hours.get('OnDemandHours', 0))
            reserved_hours = float(coverage_hours.get('ReservedHours', 0))
            total_hours = float(coverage_hours.get('TotalRunningHours', 0))
            if total_hours > 0:
                coverage_pct = (reserved_hours / total_hours) * 100
                print(f"\n{instance_type} in {region}")
                print(f"  Total Hours: {total_hours:.0f}")
                print(f"  Reserved Hours: {reserved_hours:.0f}")
                print(f"  On-Demand Hours: {on_demand_hours:.0f}")
                print(f"  Coverage: {coverage_pct:.1f}%")
                # Recommend a purchase when coverage < 70% on significant usage
                if coverage_pct < 70 and total_hours > 500:
                    print(f"  ⚠️ RECOMMENDATION: Consider purchasing RI")

def get_ri_recommendations():
    ce = boto3.client('ce')
    response = ce.get_reservation_purchase_recommendation(
        Service='Amazon Elastic Compute Cloud - Compute',
        AccountScope='PAYER',
        LookbackPeriodInDays='THIRTY_DAYS',
        TermInYears='ONE_YEAR',
        PaymentOption='NO_UPFRONT'
    )
    print("\n" + "=" * 80)
    print("RI Purchase Recommendations")
    print("=" * 80)
    for rec in response['Recommendations']:
        # RecommendationDetails is a list; instance attributes are nested
        for details in rec['RecommendationDetails']:
            ec2_details = details.get('InstanceDetails', {}).get('EC2InstanceDetails', {})
            print(f"\nInstance Type: {ec2_details.get('InstanceType', 'N/A')}")
            print(f"Region: {ec2_details.get('Region', 'N/A')}")
            print(f"Recommended: {details.get('RecommendedNumberOfInstancesToPurchase', 0)} instances")
            print(f"Monthly Savings: ${float(details.get('EstimatedMonthlySavingsAmount', 0)):.2f}")
            print(f"Upfront Cost: ${float(details.get('UpfrontCost', 0)):.2f}")
            print(f"Monthly Cost: ${float(details.get('RecurringStandardMonthlyCost', 0)):.2f}")

if __name__ == '__main__':
    analyze_ri_coverage()
    get_ri_recommendations()
Savings Plans vs Reserved Instances
| Criterion | Reserved Instances | Savings Plans |
|---|---|---|
| Flexibility | Fixed (type/region) | Flexible (type/region/family) |
| Discount | 40-72% | 40-66% |
| Commitment | Specific instance | $/hour of compute |
| Scope | EC2 only | EC2, Lambda, Fargate |
| Changes | Modify/exchange | Automatic |
| Best for | Stable workloads | Variable workloads |
2026 recommendation: Savings Plans for 60-70% of the base load, Spot for flexible workloads.
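Cost Explorer can size that base-load commitment directly. A minimal sketch, assuming Cost Explorer is enabled (the API returns amounts as strings):
#!/usr/bin/env python3
# sp_recommendation.py - sketch using Cost Explorer
import boto3

ce = boto3.client('ce')
response = ce.get_savings_plans_purchase_recommendation(
    SavingsPlansType='COMPUTE_SP',  # flexible across EC2/Fargate/Lambda
    TermInYears='ONE_YEAR',
    PaymentOption='NO_UPFRONT',
    LookbackPeriodInDays='THIRTY_DAYS'
)
rec = response['SavingsPlansPurchaseRecommendation']
summary = rec.get('SavingsPlansPurchaseRecommendationSummary', {})
print(f"Hourly commitment: ${summary.get('HourlyCommitmentToPurchase', '0')}/h")
print(f"Estimated monthly savings: ${summary.get('EstimatedMonthlySavingsAmount', '0')}")
print(f"Estimated savings: {summary.get('EstimatedSavingsPercentage', '0')}%")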
Spot instances and resilient architecture
Spot instances: 70-90% discount
Use cases:
- CI/CD runners
- Batch processing
- Data analytics
- Dev/test environments
- Stateless applications
Kubernetes with Spot instances
# spot-nodegroup.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: production
  region: eu-west-1
nodeGroups:
  # On-Demand for critical workloads
  - name: on-demand
    instanceType: m5.xlarge
    minSize: 3
    maxSize: 10
    desiredCapacity: 5
    labels:
      workload-type: critical
    taints:
      - key: workload-type
        value: critical
        effect: NoSchedule
  # Spot for interruption-tolerant workloads
  - name: spot
    instancesDistribution:
      instanceTypes:
        - m5.xlarge
        - m5a.xlarge
        - m5n.xlarge
      onDemandBaseCapacity: 0
      onDemandPercentageAboveBaseCapacity: 0
      spotInstancePools: 3
    minSize: 0
    maxSize: 50
    desiredCapacity: 10
    labels:
      workload-type: flexible
    taints:
      - key: workload-type
        value: flexible
        effect: NoSchedule
Spot-friendly Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-processor
spec:
  replicas: 10
  selector:
    matchLabels:
      app: batch-processor
  template:
    metadata:
      labels:
        app: batch-processor
    spec:
      # Tolerate Spot instances
      tolerations:
        - key: workload-type
          operator: Equal
          value: flexible
          effect: NoSchedule
      # Prefer Spot nodes
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              preference:
                matchExpressions:
                  - key: workload-type
                    operator: In
                    values:
                      - flexible
      # Graceful shutdown
      terminationGracePeriodSeconds: 120
      containers:
        - name: processor
          image: batch-processor:v1
          # Handle SIGTERM properly
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 15"]
          resources:
            requests:
              cpu: 1000m
              memory: 2Gi
            limits:
              cpu: 2000m
              memory: 4Gi
Spot interruption handler
# Install the AWS Node Termination Handler
helm repo add eks https://aws.github.io/eks-charts
helm install aws-node-termination-handler \
  eks/aws-node-termination-handler \
  --namespace kube-system \
  --set enableSpotInterruptionDraining=true \
  --set enableScheduledEventDraining=true
Custom handler:
#!/usr/bin/env python3
# spot_handler.py - runs on each Spot node
import os
import time
import subprocess
import requests

# Assumes IMDSv1 is reachable; IMDSv2-only nodes need a session token
METADATA_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"
NODE_NAME = os.environ['NODE_NAME']  # injected e.g. via the Downward API

def check_spot_termination():
    try:
        response = requests.get(METADATA_URL, timeout=1)
        if response.status_code == 200:
            return True, response.json()
    except requests.RequestException:
        pass
    return False, None

def drain_node():
    # Cordon the node
    subprocess.run(['kubectl', 'cordon', NODE_NAME])
    # Drain with a grace period
    subprocess.run([
        'kubectl', 'drain', NODE_NAME,
        '--ignore-daemonsets',
        '--delete-emptydir-data',
        '--grace-period=90'
    ])

if __name__ == '__main__':
    while True:
        terminating, action = check_spot_termination()
        if terminating:
            print(f"Spot termination notice received: {action}")
            drain_node()
            break
        time.sleep(5)
Real-time dashboards and alerts
CloudWatch billing dashboard
#!/usr/bin/env python3
# create_billing_dashboard.py
import boto3
import json

cloudwatch = boto3.client('cloudwatch')

# Note: AWS/Billing metrics are only published in us-east-1
dashboard_body = {
    "widgets": [
        {
            "type": "metric",
            "properties": {
                "metrics": [
                    ["AWS/Billing", "EstimatedCharges", "Currency", "USD", {"stat": "Maximum"}]
                ],
                "period": 21600,
                "stat": "Maximum",
                "region": "us-east-1",
                "title": "Total AWS Charges (MTD)",
                "yAxis": {
                    "left": {
                        "label": "USD"
                    }
                }
            }
        },
        {
            "type": "metric",
            "properties": {
                "metrics": [
                    ["AWS/Billing", "EstimatedCharges", "ServiceName", "AmazonEC2", "Currency", "USD"],
                    ["AWS/Billing", "EstimatedCharges", "ServiceName", "AmazonRDS", "Currency", "USD"],
                    ["AWS/Billing", "EstimatedCharges", "ServiceName", "AmazonS3", "Currency", "USD"],
                    ["AWS/Billing", "EstimatedCharges", "ServiceName", "AmazonEKS", "Currency", "USD"]
                ],
                "period": 21600,
                "stat": "Maximum",
                "region": "us-east-1",
                "title": "Charges by Service",
                "yAxis": {
                    "left": {
                        "label": "USD"
                    }
                }
            }
        }
    ]
}

cloudwatch.put_dashboard(
    DashboardName='FinOps-Billing',
    DashboardBody=json.dumps(dashboard_body)
)
print("Dashboard created: FinOps-Billing")
Budget Alerts
# AWS Budget with alerts
aws budgets create-budget \
  --account-id 123456789012 \
  --budget file://budget.json \
  --notifications-with-subscribers file://notifications.json
// budget.json
{
  "BudgetName": "Monthly-Production-Budget",
  "BudgetLimit": {
    "Amount": "10000",
    "Unit": "USD"
  },
  "TimeUnit": "MONTHLY",
  "BudgetType": "COST",
  "CostFilters": {
    "TagKeyValue": ["user:Environment$production"]
  }
}
// notifications.json
[
  {
    "Notification": {
      "NotificationType": "ACTUAL",
      "ComparisonOperator": "GREATER_THAN",
      "Threshold": 80,
      "ThresholdType": "PERCENTAGE"
    },
    "Subscribers": [
      {
        "SubscriptionType": "EMAIL",
        "Address": "finops@company.com"
      },
      {
        "SubscriptionType": "SNS",
        "Address": "arn:aws:sns:eu-west-1:123456789012:budget-alerts"
      }
    ]
  },
  {
    "Notification": {
      "NotificationType": "FORECASTED",
      "ComparisonOperator": "GREATER_THAN",
      "Threshold": 100,
      "ThresholdType": "PERCENTAGE"
    },
    "Subscribers": [
      {
        "SubscriptionType": "EMAIL",
        "Address": "cto@company.com"
      }
    ]
  }
]
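Budgets catch slow drift; for sudden spikes, AWS Cost Anomaly Detection is a useful complement. A sketch listing the last week's anomalies, assuming an anomaly monitor is already configured:
#!/usr/bin/env python3
# cost_anomalies.py - sketch; assumes a Cost Anomaly Detection monitor exists
import boto3
from datetime import datetime, timedelta

ce = boto3.client('ce')
end = datetime.now().date()
start = end - timedelta(days=7)
response = ce.get_anomalies(
    DateInterval={
        'StartDate': start.strftime('%Y-%m-%d'),
        'EndDate': end.strftime('%Y-%m-%d')
    }
)
for anomaly in response['Anomalies']:
    impact = anomaly['Impact']['TotalImpact']
    print(f"{anomaly['AnomalyId']}: ${impact:.2f} unexpected spend")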
FinOps culture and governance
Chargeback vs Showback
Showback: cost transparency, no internal billing
# showback_report.py - weekly email to each team
def generate_showback_report(team):
    # get_team_costs, format_top_resources and send_email are org-specific
    # helpers (a Cost Explorer-backed get_team_costs total is sketched below)
    costs = get_team_costs(team, days=7)
    report = f"""
FinOps Weekly Report - {team}
Last 7 days costs: ${costs['total']:.2f}
Breakdown:
- Compute (EC2/EKS): ${costs['compute']:.2f}
- Storage (EBS/S3): ${costs['storage']:.2f}
- Database (RDS): ${costs['database']:.2f}
- Networking: ${costs['network']:.2f}
Trend: {costs['trend']}% vs last week
Top 5 resources:
{format_top_resources(costs['top_resources'])}
Optimization opportunities:
- {len(costs['recommendations'])} rightsizing recommendations
- Potential monthly savings: ${costs['potential_savings']:.2f}
View detailed breakdown: https://finops.company.com/teams/{team}
"""
    send_email(f"{team}@company.com", "Weekly FinOps Report", report)
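get_team_costs is left abstract above. One possible way to back its total with Cost Explorer, filtering on the Team tag from the tagging strategy (a partial sketch; the per-service breakdown would come from additional GroupBy queries):
# get_team_total.py - sketch backing the 'total' field via Cost Explorer
import boto3
from datetime import datetime, timedelta

def get_team_total(team, days=7):
    """Total cost for a team over the last `days` days, via the Team tag."""
    ce = boto3.client('ce')
    end = datetime.now().date()
    start = end - timedelta(days=days)
    response = ce.get_cost_and_usage(
        TimePeriod={
            'Start': start.strftime('%Y-%m-%d'),
            'End': end.strftime('%Y-%m-%d')
        },
        Granularity='DAILY',
        Metrics=['UnblendedCost'],
        Filter={'Tags': {'Key': 'Team', 'Values': [team]}}
    )
    return sum(
        float(day['Total']['UnblendedCost']['Amount'])
        for day in response['ResultsByTime']
    )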
Chargeback: actual internal billing of teams
# chargeback_invoice.py - monthly
def generate_chargeback_invoice(team, month):
    costs = get_team_costs(team, month=month)
    # Apply a markup for infra overhead
    markup = 1.15  # 15% overhead
    total_with_markup = costs['total'] * markup
    invoice = {
        'team': team,
        'period': month,
        'subtotal': costs['total'],
        'markup': costs['total'] * 0.15,
        'total': total_with_markup,
        'cost_center': get_cost_center(team)
    }
    # Export to the ERP
    export_to_erp(invoice)
    return invoice
FinOps KPIs
# finops_kpis.py - executive dashboard
def calculate_finops_kpis():
    # Inputs (total_costs, used_compute, ...) come from billing exports
    # and the monitoring stack
    return {
        # Unit costs
        'cost_per_customer': total_costs / total_customers,
        'cost_per_transaction': total_costs / total_transactions,
        'cost_per_api_call': total_costs / total_api_calls,
        # Efficiency
        'compute_utilization': used_compute / provisioned_compute,
        'storage_utilization': used_storage / provisioned_storage,
        'waste_percentage': wasted_spend / total_spend,
        # Coverage
        'ri_coverage': reserved_hours / total_hours,
        'spot_usage': spot_hours / total_hours,
        # Governance
        'tagged_resources': tagged / total_resources,
        'budget_adherence': actual_spend / budgeted_spend
    }
FinOps checklist
✅ Phase 1: Visibility (Month 1)
- Tagging strategy defined and enforced
- Tagging compliance ≥90%
- Cost Explorer configured
- Billing dashboards created
- Data exported to a data lake
✅ Phase 2: Analysis (Month 2)
- EC2/RDS utilization analysis
- Rightsizing recommendations
- Storage optimization (EBS/S3)
- Idle resources identified
- Quick wins implemented (20-30% savings)
✅ Phase 3: Optimization (Months 3-4)
- Kubecost deployed (if on K8s)
- RI/Savings Plans purchased (60-70% of base load)
- Spot instance architecture in place
- Budgets and alerts active
- Weekly showback reports
✅ Phase 4: Governance (Months 5-6)
- Automated policies (tag enforcement)
- Chargeback implemented
- Monthly FinOps reviews
- KPIs tracked and reported
- FinOps culture established
Conclusion
FinOps becomes essential in 2026 as cloud budgets keep growing. Tagging, rightsizing, Reserved Instances, and Kubecost together can save 30-50% while preserving performance and agility.
Key takeaways:
- Tagging is the foundation of visibility
- Rightsizing delivers 20-30% quick wins
- Kubecost is essential for Kubernetes FinOps
- RI/Savings Plans: 40-70% discount on the base load
- Spot: 70-90% discount on flexible workloads
Typical gains:
- Savings: 30-50% of the cloud budget
- ROI: 2-6 months
- Visibility: 100% of resources tagged
- Efficiency: +40% compute utilization
- Waste: -80% idle resources
Priority actions:
- Enforce mandatory tagging
- Run an EC2/RDS rightsizing analysis
- Deploy Kubecost (if on K8s)
- Purchase RI/Savings Plans for the base load
- Architect for Spot instances