The Fundamental Problem of Machine Learning
The bias-variance tradeoff is the most important concept for understanding why an ML model works or fails. A model's expected error decomposes into bias (how much the model's assumptions deviate from reality), variance (how sensitive the model is to fluctuations in the training data), and irreducible noise. The goal is finding the balance point that minimizes the reducible part of the error.
Overfitting occurs when the model memorizes training data including noise, achieving excellent performance on the training set but poor performance on new data. Underfitting occurs when the model is too simple to capture the real patterns in the data. Recognizing and solving these problems is one of the most valuable skills in ML.
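A minimal sketch of both failure modes (not part of the original code): fitting polynomials of increasing degree to noisy samples of a sine curve. Degree 1 underfits (high error on both splits), while degree 15 drives the training error down much further than the test error, the signature of overfitting. The data and degrees here are illustrative choices.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Noisy samples of a sine curve
rng = np.random.RandomState(42)
X = np.sort(rng.uniform(0, 1, 100)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.3, 100)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

results = {}
for degree in [1, 4, 15]:
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    mse_tr = mean_squared_error(y_tr, model.predict(X_tr))
    mse_te = mean_squared_error(y_te, model.predict(X_te))
    results[degree] = (mse_tr, mse_te)
    print(f"degree={degree:<3d} train_mse={mse_tr:.3f} test_mse={mse_te:.3f} "
          f"gap={mse_te - mse_tr:.3f}")
```

Watching the train/test gap grow with model complexity is the same diagnostic the learning and validation curves below automate.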
What You Will Learn in This Article
- Bias-Variance tradeoff and how to diagnose it
- Signs of overfitting and underfitting
- Learning curves for diagnosis
- Cross-validation strategies
- L1 (Lasso) and L2 (Ridge) regularization
- Early stopping and data augmentation
Diagnosing Overfitting and Underfitting
The most direct way to diagnose overfitting and underfitting is comparing performance on the training set and test set. If the model has high performance on training but low on test, it is overfitting. If it has low performance on both, it is underfitting. Learning curves visualize how performance changes with varying data amounts or model complexity.
from sklearn.model_selection import learning_curve, validation_curve
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_breast_cancer
import numpy as np

# Dataset
data = load_breast_cancer()
X, y = data.data, data.target

# Learning curve: performance vs training set size
train_sizes, train_scores, val_scores = learning_curve(
    DecisionTreeClassifier(random_state=42),
    X, y,
    train_sizes=np.linspace(0.1, 1.0, 10),
    cv=5,
    scoring='accuracy',
    n_jobs=-1
)

print("Learning Curve (Decision Tree without limits):")
print(f"{'Train Size':<12s} {'Train Acc':<12s} {'Val Acc':<12s} {'Gap':<8s} Status")
for size, train, val in zip(
    train_sizes,
    train_scores.mean(axis=1),
    val_scores.mean(axis=1)
):
    gap = train - val
    status = "OVERFIT" if gap > 0.05 else "OK"
    print(f"{size:<12d} {train:<12.3f} {val:<12.3f} {gap:<8.3f} {status}")

# Validation curve: performance vs model complexity
param_range = range(1, 20)
train_scores_vc, val_scores_vc = validation_curve(
    DecisionTreeClassifier(random_state=42),
    X, y,
    param_name='max_depth',
    param_range=param_range,
    cv=5,
    scoring='accuracy',
    n_jobs=-1
)

print("\nValidation Curve (max_depth):")
best_depth = 1
best_val = 0
for depth, train, val in zip(
    param_range,
    train_scores_vc.mean(axis=1),
    val_scores_vc.mean(axis=1)
):
    if val > best_val:
        best_val = val
        best_depth = depth
    print(f"  depth={depth:<3d} train={train:.3f} val={val:.3f}")

print(f"\nBest max_depth: {best_depth} (val accuracy: {best_val:.3f})")
Cross-Validation: Robust Evaluation
Cross-validation is the standard technique for reliably estimating generalization performance. K-Fold CV divides the dataset into K equal parts: at each iteration, one part is used as test and the remaining K-1 as training. It repeats K times and computes the average performance. Stratified K-Fold maintains class proportions in each fold, essential for imbalanced datasets.
from sklearn.model_selection import (
    KFold, StratifiedKFold, RepeatedStratifiedKFold,
    cross_val_score, cross_validate
)
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X, y = data.data, data.target

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', RandomForestClassifier(n_estimators=100, random_state=42))
])

# CV strategies
strategies = {
    '5-Fold': KFold(n_splits=5, shuffle=True, random_state=42),
    'Stratified 5-Fold': StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
    'Repeated Strat 5x3': RepeatedStratifiedKFold(
        n_splits=5, n_repeats=3, random_state=42
    )
}

for name, cv in strategies.items():
    scores = cross_val_score(pipeline, X, y, cv=cv, scoring='accuracy')
    print(f"{name:<25s}: {scores.mean():.4f} (+/- {scores.std():.4f})")

# cross_validate for multiple metrics
results = cross_validate(
    pipeline, X, y,
    cv=StratifiedKFold(5, shuffle=True, random_state=42),
    scoring=['accuracy', 'precision', 'recall', 'f1'],
    return_train_score=True
)

print("\ncross_validate details:")
for metric in ['accuracy', 'precision', 'recall', 'f1']:
    train = results[f'train_{metric}'].mean()
    test = results[f'test_{metric}'].mean()
    gap = train - test
    print(f"  {metric:<12s}: train={train:.3f} test={test:.3f} gap={gap:.3f}")
Regularization: L1 (Lasso) and L2 (Ridge)
Regularization adds a penalty term to the cost function to discourage overly complex models. L2 (Ridge) adds the sum of squared weights: it shrinks all weights toward zero but never zeroes them out. L1 (Lasso) adds the sum of absolute weight values: it can completely zero out some weights, implicitly performing feature selection. Elastic Net combines L1 and L2, controlling the mix with the l1_ratio parameter.
The parameter alpha controls regularization strength: high alpha penalizes more (simpler model, risk of underfitting), low alpha penalizes less (more complex model, risk of overfitting). LogisticRegression uses the inverse parameterization C = 1/alpha, so a small C means strong regularization and a large C means almost none.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.datasets import load_breast_cancer
import numpy as np

data = load_breast_cancer()
X, y = data.data, data.target

# Regularization comparison for classification
regularizations = {
    'No Reg (C=1e6)': LogisticRegression(C=1e6, max_iter=10000, random_state=42),
    'L2 Weak (C=10)': LogisticRegression(C=10, penalty='l2', max_iter=10000, random_state=42),
    'L2 Strong (C=0.01)': LogisticRegression(C=0.01, penalty='l2', max_iter=10000, random_state=42),
    'L1 (C=1)': LogisticRegression(C=1, penalty='l1', solver='saga', max_iter=10000, random_state=42),
    'ElasticNet': LogisticRegression(C=1, penalty='elasticnet', solver='saga',
                                     l1_ratio=0.5, max_iter=10000, random_state=42)
}

print("Regularization Comparison:")
for name, model in regularizations.items():
    pipeline = Pipeline([('scaler', StandardScaler()), ('clf', model)])
    scores = cross_val_score(pipeline, X, y, cv=5, scoring='accuracy')
    # Count non-zero coefficients (after fit)
    pipeline.fit(X, y)
    n_nonzero = np.sum(np.abs(pipeline.named_steps['clf'].coef_) > 1e-5)
    print(f"  {name:<22s}: acc={scores.mean():.3f} active_features={n_nonzero}/{X.shape[1]}")
Early Stopping and Data Augmentation
Early stopping is a regularization technique for iteratively trained models (gradient boosting, neural networks): it monitors validation set performance at each epoch and stops training when performance stops improving. It prevents overfitting without manually choosing the number of iterations.
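A minimal sketch of early stopping with scikit-learn's GradientBoostingClassifier (an illustrative choice, not the only option): the n_iter_no_change and validation_fraction parameters make the estimator hold out part of the training data internally and stop adding trees once the validation score plateaus.

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_tr, X_te, y_tr, y_te = train_test_split(
    data.data, data.target, stratify=data.target, random_state=42
)

# Without early stopping: always fits all 500 trees
gb_full = GradientBoostingClassifier(n_estimators=500, random_state=42)
gb_full.fit(X_tr, y_tr)

# With early stopping: holds out 10% of the training data and stops once
# the validation score has not improved by tol for 10 consecutive iterations
gb_early = GradientBoostingClassifier(
    n_estimators=500,
    validation_fraction=0.1,
    n_iter_no_change=10,
    tol=1e-4,
    random_state=42
)
gb_early.fit(X_tr, y_tr)

# n_estimators_ reports how many trees were actually fitted
print(f"Full model:     {gb_full.n_estimators_} trees, "
      f"test acc={gb_full.score(X_te, y_te):.3f}")
print(f"Early stopping: {gb_early.n_estimators_} trees, "
      f"test acc={gb_early.score(X_te, y_te):.3f}")
```

The early-stopped model typically fits far fewer trees at comparable test accuracy, trading a small internal validation split for automatic selection of the iteration count.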
Data augmentation is the most effective strategy against overfitting when data is scarce: generating new training samples through label-preserving transformations. For images: rotations, flips, crops, color variations. For text: synonyms, back-translation. For tabular data: SMOTE for imbalanced data or adding Gaussian noise.
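For tabular data, Gaussian-noise augmentation can be sketched in a few lines. This is an illustrative example, not from the original code: the helper augment_gaussian, the 40-sample training split, and the noise scale of 5% of each feature's standard deviation are all assumptions chosen to simulate a scarce-data regime.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
# Simulate scarce data: keep only 40 training samples
X_tr, X_te, y_tr, y_te = train_test_split(
    data.data, data.target, train_size=40, stratify=data.target, random_state=42
)

def augment_gaussian(X, y, n_copies=5, scale=0.05, seed=42):
    """Append jittered copies of each sample; noise is scaled per-feature
    so that wide-ranging features get proportionally larger perturbations."""
    rng = np.random.RandomState(seed)
    stds = X.std(axis=0)
    X_aug, y_aug = [X], [y]
    for _ in range(n_copies):
        X_aug.append(X + rng.normal(0, scale * stds, X.shape))
        y_aug.append(y)
    return np.vstack(X_aug), np.concatenate(y_aug)

X_big, y_big = augment_gaussian(X_tr, y_tr)
print(f"Training set: {X_tr.shape[0]} -> {X_big.shape[0]} samples")

model = Pipeline([('scaler', StandardScaler()),
                  ('clf', LogisticRegression(max_iter=10000, random_state=42))])
model.fit(X_big, y_big)
print(f"Test accuracy with augmented data: {model.score(X_te, y_te):.3f}")
```

The key design constraint is that the transformation must preserve labels: the noise scale has to stay small relative to class separation, which is why it is tied to each feature's standard deviation.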
Rule of thumb: If the gap between training accuracy and validation accuracy exceeds 5%, the model is probably overfitting. If validation accuracy is below 70% for a not-too-difficult problem, the model is probably underfitting. Learning curves are the most informative diagnostic tool.
Key Takeaways
- High bias = underfitting (too simple); High variance = overfitting (too complex)
- Learning curves visualize the gap between training and validation performance
- Cross-validation (Stratified K-Fold) is the gold standard for evaluation
- L2 (Ridge) shrinks all weights; L1 (Lasso) zeros out some weights (implicit feature selection)
- Early stopping halts training when validation score stops improving
- More data = less overfitting: data augmentation helps when the dataset is small