K-Nearest Neighbors: The Lazy Classifier
K-Nearest Neighbors (KNN) is one of the most intuitive machine learning algorithms: to classify a new point, it looks at the K nearest points in the training set and assigns the majority class among them. It builds no explicit model during training (that is why it is called a lazy learner): all the computational work happens at prediction time, when the algorithm must search the feature space for neighbors.
KNN's simplicity is both its strength and its weakness: it is easy to understand and implement, but it becomes slow on large datasets because each prediction requires computing the distance to every training point. Data structures such as KD-trees and ball trees mitigate this cost but do not eliminate it entirely.
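The prediction step described above can be sketched in a few lines of NumPy. This is a minimal illustration of the idea (distance computation plus majority vote), not a replacement for scikit-learn's optimized KNeighborsClassifier:

```python
import numpy as np

def knn_predict(X_train, y_train, x_new, k=3):
    # Euclidean distance from the new point to every training point
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k closest training points
    nearest = np.argsort(distances)[:k]
    # Majority vote among the neighbors' labels
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Tiny example: two classes in well-separated regions
X_train = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
                    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X_train, y_train, np.array([0.1, 0.2]), k=3))  # -> 0
print(knn_predict(X_train, y_train, np.array([5.1, 5.0]), k=3))  # -> 1
```

Note that every prediction scans the whole training set: this is the "lazy" cost that KD-trees and ball trees are designed to reduce.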
What You Will Learn in This Article
- How KNN works and distance metrics
- How to choose the optimal value of K
- K-Means: the most popular clustering algorithm
- DBSCAN: density-based clustering
- How to evaluate clustering without labels
- Silhouette Score and Elbow method
Distance Metrics
KNN relies on the concept of distance between points in the feature space. The choice of metric significantly influences results. Euclidean distance is the most common: the straight line between two points in n-dimensional space. Manhattan distance calculates the sum of absolute differences (like walking in a street grid). Minkowski distance is a generalization of both, controlled by the parameter p.
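The three metrics can be compared directly on a pair of points; a quick sketch using scipy.spatial.distance, which also shows that Minkowski with p=1 reduces to Manhattan and p=2 to Euclidean:

```python
import numpy as np
from scipy.spatial.distance import euclidean, cityblock, minkowski

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 6.0, 3.0])  # differences: (3, 4, 0)

print(euclidean(a, b))       # sqrt(3^2 + 4^2 + 0^2) = 5.0
print(cityblock(a, b))       # |3| + |4| + |0| = 7.0 (Manhattan)
print(minkowski(a, b, p=1))  # 7.0, same as Manhattan
print(minkowski(a, b, p=2))  # 5.0, same as Euclidean
```

In scikit-learn, KNeighborsClassifier exposes the same choice through its metric and p parameters (the default, minkowski with p=2, is plain Euclidean distance).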
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
import numpy as np
# Dataset
wine = load_wine()
X, y = wine.data, wine.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
# Find optimal K
k_range = range(1, 31)
cv_scores = []
for k in k_range:
    pipeline = Pipeline([
        ('scaler', StandardScaler()),
        ('knn', KNeighborsClassifier(n_neighbors=k))
    ])
    scores = cross_val_score(pipeline, X_train, y_train, cv=5, scoring='accuracy')
    cv_scores.append(scores.mean())
best_k = k_range[np.argmax(cv_scores)]
print(f"Best K: {best_k} with accuracy: {max(cv_scores):.3f}")
# Final model with optimal K
final_pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('knn', KNeighborsClassifier(n_neighbors=best_k, weights='distance'))
])
final_pipeline.fit(X_train, y_train)
test_accuracy = final_pipeline.score(X_test, y_test)
print(f"Test accuracy with K={best_k}: {test_accuracy:.3f}")
K-Means Clustering
K-Means is the most widely used clustering algorithm. It partitions data into K clusters, where each cluster is defined by its centroid (the mean point). The algorithm alternates between two phases: assigning each point to the nearest centroid, and recalculating centroids as the mean of assigned points. It repeats until convergence.
K-Means works well with spherical clusters of similar sizes but has important limitations: it requires K to be specified in advance, is sensitive to centroid initialization (partially solved by K-Means++), and does not handle irregularly shaped or differently sized clusters well.
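The assign/update loop described above can be written explicitly. Here is a minimal sketch of Lloyd's algorithm in NumPy, using random training points as initial centroids (scikit-learn's KMeans adds K-Means++ initialization, multiple restarts, and empty-cluster handling, which this sketch omits):

```python
import numpy as np

def kmeans_lloyd(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize centroids as k distinct random training points
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to the mean of its assigned points
        # (note: a cluster left empty would produce NaN here; sklearn handles this)
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break  # convergence: centroids stopped moving
        centroids = new_centroids
    return labels, centroids

# Two obvious groups; the loop should separate them
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels, centroids = kmeans_lloyd(X, k=2)
print(labels, centroids)
```

The cluster numbering depends on the random initialization, but the partition itself is the same: the two left points end up together, as do the two right points.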
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import silhouette_score
import numpy as np
# Example data: customer segmentation
np.random.seed(42)
# 3 natural customer groups
group1 = np.random.randn(100, 2) * 0.5 + [2, 2] # Budget
group2 = np.random.randn(100, 2) * 0.8 + [7, 7] # Premium
group3 = np.random.randn(100, 2) * 0.6 + [2, 8] # Frequent
X = np.vstack([group1, group2, group3])
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
# Elbow method: find optimal K
inertias = []
silhouettes = []
K_range = range(2, 11)
for k in K_range:
    kmeans = KMeans(n_clusters=k, random_state=42, n_init=10)
    labels = kmeans.fit_predict(X_scaled)
    inertias.append(kmeans.inertia_)
    silhouettes.append(silhouette_score(X_scaled, labels))
# Results
for k, inertia, sil in zip(K_range, inertias, silhouettes):
    marker = " <-- optimal" if k == 3 else ""
    print(f"K={k}: Inertia={inertia:.1f}, Silhouette={sil:.3f}{marker}")
# Final model
kmeans_final = KMeans(n_clusters=3, random_state=42, n_init=10)
clusters = kmeans_final.fit_predict(X_scaled)
print(f"\nCluster distribution: {np.bincount(clusters)}")
DBSCAN: Density-Based Clustering
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a clustering algorithm that does not require specifying the number of clusters in advance. It identifies clusters as dense regions separated by low-density regions. Two parameters control behavior: eps (the neighborhood radius) and min_samples (the minimum number of points to form a cluster).
DBSCAN classifies points into three categories: core points (with at least min_samples points within distance eps), border points (within eps of a core point but with fewer than min_samples neighbors of their own), and noise points (neither core nor border). Unlike K-Means, DBSCAN handles arbitrarily shaped clusters and flags outliers automatically.
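The three categories can be recovered from a fitted model: scikit-learn's DBSCAN exposes core points through its core_sample_indices_ attribute, noise points carry the label -1, and border points are whatever remains. A small sketch on synthetic data (one dense blob plus two isolated outliers):

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(42)
# 50 points in a dense blob, plus two isolated outliers
X = np.vstack([rng.normal(0, 0.2, (50, 2)),
               [[5.0, 5.0], [-5.0, 5.0]]])

db = DBSCAN(eps=0.5, min_samples=5).fit(X)

core_mask = np.zeros(len(X), dtype=bool)
core_mask[db.core_sample_indices_] = True   # core points
noise_mask = db.labels_ == -1               # noise points
border_mask = ~core_mask & ~noise_mask      # in a cluster, but not core

print(f"core: {core_mask.sum()}, border: {border_mask.sum()}, noise: {noise_mask.sum()}")
```

The two isolated points have no neighbors within eps, so they cannot be core and are not reachable from any core point: DBSCAN labels them noise without being told they are outliers.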
from sklearn.cluster import DBSCAN, AgglomerativeClustering
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import silhouette_score
from sklearn.datasets import make_moons
import numpy as np
# Dataset with non-spherical clusters (moons)
X, y_true = make_moons(n_samples=300, noise=0.1, random_state=42)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
# DBSCAN
dbscan = DBSCAN(eps=0.3, min_samples=10)
labels_dbscan = dbscan.fit_predict(X_scaled)
n_clusters = len(set(labels_dbscan)) - (1 if -1 in labels_dbscan else 0)
n_noise = list(labels_dbscan).count(-1)
print(f"DBSCAN: {n_clusters} clusters, {n_noise} noise points")
if n_clusters > 1:
    mask = labels_dbscan != -1
    sil = silhouette_score(X_scaled[mask], labels_dbscan[mask])
    print(f" Silhouette (excluding noise): {sil:.3f}")
# Hierarchical Clustering (Agglomerative)
agg = AgglomerativeClustering(n_clusters=2, linkage='ward')
labels_agg = agg.fit_predict(X_scaled)
sil_agg = silhouette_score(X_scaled, labels_agg)
print(f"\nHierarchical (Ward): Silhouette = {sil_agg:.3f}")
# K-Means for comparison (struggles with moons)
from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=2, random_state=42, n_init=10)
labels_km = kmeans.fit_predict(X_scaled)
sil_km = silhouette_score(X_scaled, labels_km)
print(f"K-Means: Silhouette = {sil_km:.3f}")
Clustering Evaluation Metrics
Evaluating clustering is harder than evaluating supervised classification because there are no reference labels. The Silhouette Score measures how similar each point is to its own cluster compared to the nearest neighboring cluster: it ranges from -1 (wrong assignment) to 1 (dense, well-separated clusters). The Davies-Bouldin Index measures the average similarity between each cluster and its most similar one: lower is better. The Calinski-Harabasz Index is the ratio of between-cluster to within-cluster dispersion: higher is better.
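All three metrics are available in scikit-learn and take only the data and the predicted labels. A quick sketch comparing them on the same clustering of well-separated synthetic blobs:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (silhouette_score, davies_bouldin_score,
                             calinski_harabasz_score)

# Well-separated synthetic clusters
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.7, random_state=42)
labels = KMeans(n_clusters=3, random_state=42, n_init=10).fit_predict(X)

print(f"Silhouette:        {silhouette_score(X, labels):.3f}")         # higher is better, in [-1, 1]
print(f"Davies-Bouldin:    {davies_bouldin_score(X, labels):.3f}")     # lower is better
print(f"Calinski-Harabasz: {calinski_harabasz_score(X, labels):.1f}")  # higher is better
```

Because the metrics point in different directions (silhouette and Calinski-Harabasz up, Davies-Bouldin down), it helps to check that they agree when comparing candidate values of K.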
KNN vs K-Means: Do not confuse them! KNN is a supervised algorithm for classification/regression (uses labels). K-Means is an unsupervised clustering algorithm (does not use labels). The only thing they share is the letter K.
When to Use Each Algorithm
- KNN: classification problems on small-to-medium datasets, when the relationship between features and target is local and non-linear
- K-Means: segmentation with roughly spherical, balanced clusters
- DBSCAN: clusters with irregular shapes, significant outliers, or an unknown number of clusters
- Hierarchical: when you need to understand the grouping hierarchy and the dataset is not too large
Key Takeaways
- KNN classifies based on the K nearest neighbors: simple but slow on large datasets
- K selection is done with cross-validation: too low causes overfitting, too high underfitting
- K-Means partitions into K clusters with centroids; requires known K and works with spherical clusters
- DBSCAN finds arbitrarily shaped clusters based on density and identifies outliers
- Silhouette Score is the most used metric for evaluating clustering quality
- Feature scaling is essential for all distance-based algorithms