Introduction: Compressing Data Without Losing Information
Real-world datasets often have hundreds or thousands of features, many of them redundant or correlated with one another. Dimensionality reduction compresses the data into a lower-dimensional space while retaining most of the useful information. The most widely used algorithm is PCA (Principal Component Analysis), which relies entirely on the eigenvalues and eigenvectors of the covariance matrix.
What You Will Learn
- The curse of dimensionality and why reduction matters
- Covariance matrix: understanding correlations
- PCA: finding the directions of maximum variance
- Explained variance and choosing the number of components
- t-SNE and UMAP for non-linear visualization
- Full implementation in NumPy and scikit-learn
The Curse of Dimensionality
In high-dimensional spaces, data becomes sparse and pairwise distances concentrate: all points end up approximately equidistant. This makes distance metrics less informative and encourages overfitting. PCA mitigates this by projecting the data onto its most informative directions.
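The distance-concentration effect is easy to see numerically. The following sketch (illustrative, not from the article; sample sizes and seed are assumptions) measures how the gap between the nearest and farthest neighbor shrinks relative to the distances themselves as the dimension grows:

```python
import numpy as np

# Illustrative sketch: as dimension d grows, the relative spread between
# the nearest and farthest point shrinks -- distances "concentrate".
rng = np.random.default_rng(0)
spreads = {}
for d in [2, 10, 100, 1000]:
    X = rng.standard_normal((500, d))
    dists = np.linalg.norm(X - X[0], axis=1)[1:]  # distances from point 0
    spreads[d] = (dists.max() - dists.min()) / dists.min()
    print(f"d={d:5d}  relative spread (max-min)/min = {spreads[d]:.3f}")
```

The relative spread drops sharply with d, which is why nearest-neighbor distinctions become unreliable in high dimensions.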
Covariance Matrix
The covariance matrix \\mathbf{C} captures the correlations between all pairs of features. For a centered dataset (zero mean) \\mathbf{X} \\in \\mathbb{R}^{n \\times d}:
\\mathbf{C} = \\frac{1}{n-1} \\mathbf{X}^\\top \\mathbf{X}
Each element C_{ij} is the covariance between features i and j:
C_{ij} = \\frac{1}{n-1} \\sum_{m=1}^{n} x_{mi} \\, x_{mj}
The diagonal contains each feature's variance; off-diagonal elements contain covariances. If C_{ij} > 0, features i and j are positively correlated; if C_{ij} < 0, negatively correlated; if C_{ij} = 0, uncorrelated.
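These properties can be checked on a tiny synthetic example (the data and seed below are assumptions for illustration): two strongly correlated features and one independent feature.

```python
import numpy as np

# Minimal sketch: two correlated features plus an independent one.
rng = np.random.default_rng(1)
a = rng.standard_normal(1000)
b = 2 * a + 0.1 * rng.standard_normal(1000)   # strongly correlated with a
c = rng.standard_normal(1000)                  # independent of both
X = np.column_stack([a, b, c])
C = np.cov(X, rowvar=False)   # rowvar=False: columns are features
print(np.round(C, 2))
# Diagonal: variances of a, b, c. C[0, 1] is large (a and b correlated);
# C[0, 2] and C[1, 2] are near zero (c is uncorrelated with a and b).
```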
PCA: Mathematical Derivation
PCA seeks the directions along which the data has maximum variance. The first principal component \\mathbf{w}_1 is the unit vector that maximizes the variance of the projection:
\\mathbf{w}_1 = \\arg\\max_{\\|\\mathbf{w}\\| = 1} \\mathbf{w}^\\top \\mathbf{C} \\mathbf{w}
Using Lagrange multipliers, it can be shown that the solution is the eigenvector corresponding to the largest eigenvalue of \\mathbf{C}:
\\mathbf{C} \\mathbf{w}_1 = \\lambda_1 \\mathbf{w}_1
where \\lambda_1 \\geq \\lambda_2 \\geq \\cdots \\geq \\lambda_d \\geq 0 are the eigenvalues in decreasing order. The eigenvalue \\lambda_i is exactly the variance captured by the i-th principal component.
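The claim that \\lambda_i equals the variance of the i-th projection can be verified numerically. A minimal sketch (toy data and seed are assumptions):

```python
import numpy as np

# Sketch: the variance of the projection onto the top eigenvector
# equals the largest eigenvalue of the covariance matrix.
rng = np.random.default_rng(2)
X = rng.standard_normal((500, 2)) @ np.array([[3.0, 1.0], [0.0, 0.5]])
Xc = X - X.mean(axis=0)
C = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)   # eigh: ascending order, symmetric C
w1 = eigvecs[:, -1]                    # eigenvector of the largest eigenvalue
proj_var = np.var(Xc @ w1, ddof=1)     # ddof=1 matches np.cov's normalization
print(f"lambda_1 = {eigvals[-1]:.4f}, Var(X w1) = {proj_var:.4f}")
```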
Projection and Reconstruction
To reduce to k dimensions, we project onto the first k eigenvectors, stacked as \\mathbf{W}_k = [\\mathbf{w}_1, \\ldots, \\mathbf{w}_k] \\in \\mathbb{R}^{d \\times k}:
\\mathbf{Z} = \\mathbf{X} \\mathbf{W}_k
The approximate reconstruction is:
\\hat{\\mathbf{X}} = \\mathbf{Z} \\mathbf{W}_k^\\top = \\mathbf{X} \\mathbf{W}_k \\mathbf{W}_k^\\top
Explained Variance
The fraction of variance explained by the first k components is:
\\text{EVR}(k) = \\frac{\\sum_{i=1}^{k} \\lambda_i}{\\sum_{i=1}^{d} \\lambda_i}
In practice, k is chosen to retain 95% or 99% of total variance.
import numpy as np
# Synthetic dataset: 200 samples, 5 features (correlated)
np.random.seed(42)
n, d = 200, 5
X = np.random.randn(n, 2) @ np.array([[2, 1, 0.5, 0.3, 0.1],
                                      [0.5, 1.5, 1, 0.2, 0.8]])
X += np.random.randn(n, d) * 0.3 # Noise
# PCA from scratch
# 1. Center the data
X_centered = X - X.mean(axis=0)
# 2. Covariance matrix
C = np.cov(X_centered, rowvar=False)
print(f"Covariance matrix:\n{np.round(C, 3)}\n")
# 3. Eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eigh(C)
# Sort in descending order
idx = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[idx]
eigenvectors = eigenvectors[:, idx]
print(f"Eigenvalues: {np.round(eigenvalues, 4)}")
# 4. Explained variance
var_explained = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(var_explained)
for i in range(d):
    print(f"PC{i+1}: {var_explained[i]*100:.1f}% (cumulative: {cumulative[i]*100:.1f}%)")
# 5. Project to 2D
k = 2
W_k = eigenvectors[:, :k]
Z = X_centered @ W_k
print(f"\nOriginal shape: {X.shape} -> Reduced: {Z.shape}")
# 6. Reconstruction error
X_reconstructed = Z @ W_k.T + X.mean(axis=0)
reconstruction_error = np.mean((X - X_reconstructed)**2)
print(f"Reconstruction error (MSE): {reconstruction_error:.6f}")
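The "retain 95% (or 99%) of variance" rule can be turned into a one-liner with np.searchsorted on the cumulative ratio. A self-contained sketch (the eigenvalues below are hypothetical, sorted in descending order):

```python
import numpy as np

# Sketch: pick the smallest k whose cumulative explained variance
# reaches a target fraction. Eigenvalues here are hypothetical.
eigenvalues = np.array([4.1, 2.3, 0.9, 0.4, 0.2, 0.1])
cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
for target in (0.95, 0.99):
    # searchsorted finds the first index where cumulative >= target
    k = int(np.searchsorted(cumulative, target) + 1)
    print(f"target {target:.0%}: keep k = {k} components "
          f"({cumulative[k-1]:.1%} retained)")
```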
PCA with Scikit-Learn
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
import numpy as np
# Standardization (important! PCA is scale-sensitive)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
# Automatic PCA
pca = PCA(n_components=0.95) # Keep 95% variance
X_pca = pca.fit_transform(X_scaled)
print(f"Components selected: {pca.n_components_}")
print(f"Explained variance: {pca.explained_variance_ratio_}")
print(f"Shape: {X.shape} -> {X_pca.shape}")
Beyond PCA: t-SNE and UMAP
PCA is limited to linear transformations. For non-linear structures in data, methods like t-SNE and UMAP are used.
t-SNE
t-SNE (t-distributed Stochastic Neighbor Embedding) preserves local structure: points that are close in the original space remain close in the 2D representation. It minimizes the Kullback-Leibler divergence between the pairwise-similarity distributions P (original space) and Q (reduced space):
\\mathrm{KL}(P \\,\\|\\, Q) = \\sum_{i \\neq j} p_{ij} \\log \\frac{p_{ij}}{q_{ij}}
UMAP
UMAP (Uniform Manifold Approximation and Projection) is faster than t-SNE and better preserves global structure. It is based on algebraic topology and fuzzy graph theory.
When to use which: PCA for preprocessing (reducing dimensions before a classifier, denoising); t-SNE/UMAP for 2D/3D visualization (exploring clusters and outliers). PCA is invertible and interpretable; t-SNE/UMAP are not.
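As a concrete sketch of the visualization use case, here is t-SNE applied to a subsample of the digits dataset via scikit-learn (the subsample size and t-SNE parameters are illustrative assumptions, not tuned values):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# Sketch: embed a subsample of the 64-dimensional digits data in 2D.
digits = load_digits()
X, y = digits.data[:500], digits.target[:500]   # subsample to keep it fast
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=42)
X_2d = tsne.fit_transform(X)
print(f"{X.shape} -> {X_2d.shape}")             # 64 features -> 2 for plotting
```

Scattering X_2d colored by y typically shows the ten digit classes as separate clusters. UMAP (from the separate umap-learn package) exposes a similar fit_transform interface and is usually faster on large datasets.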
Application: PCA for ML Preprocessing
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import numpy as np
# Digits dataset: 1797 images 8x8 = 64 features
digits = load_digits()
X, y = digits.data, digits.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Without PCA (64 features)
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)
clf_full = LogisticRegression(max_iter=5000)
clf_full.fit(X_train_s, y_train)
acc_full = clf_full.score(X_test_s, y_test)
# With PCA (keep 95% variance)
pca = PCA(n_components=0.95)
X_train_pca = pca.fit_transform(X_train_s)
X_test_pca = pca.transform(X_test_s)
clf_pca = LogisticRegression(max_iter=5000)
clf_pca.fit(X_train_pca, y_train)
acc_pca = clf_pca.score(X_test_pca, y_test)
print(f"Without PCA: {X_train_s.shape[1]} features, Accuracy: {acc_full:.4f}")
print(f"With PCA: {X_train_pca.shape[1]} features, Accuracy: {acc_pca:.4f}")
print(f"Reduction: {(1 - X_train_pca.shape[1]/X_train_s.shape[1])*100:.0f}% of features")
Summary and Connections to ML
Key Takeaways
- PCA: projects onto the first k eigenvectors of the covariance matrix
- Eigenvalues \\lambda_i: variance captured by each component
- Explained variance: choose k to retain 95%+ of total variance
- Standardization: essential before PCA (scale-sensitive)
- t-SNE/UMAP: for non-linear 2D/3D visualization
- PCA for preprocessing: reduces overfitting and speeds up training
In the next article, we will explore loss functions in detail: MSE, cross-entropy, focal loss, hinge loss, and how to choose among them and create custom ones.