Introduction: From Sample to Population
Inferential statistics allows us to draw conclusions about an entire population by observing only a sample. In ML, this is fundamental: we train on a training set (sample) and want the model to work on unseen data (population). Confidence intervals, hypothesis tests, and A/B testing are indispensable tools for every data scientist.
What You Will Learn
- Standard error and sampling distribution
- Confidence intervals: what they really mean
- Hypothesis testing: null hypothesis, p-value, Type I and Type II errors
- T-test and Chi-square test
- A/B testing: setup, power analysis, early stopping
- Effect size and why the p-value is not enough
Sampling Distribution and Standard Error
The sampling distribution of the mean describes how the sample mean varies if we repeat the experiment many times. By the CLT, for large n the sample mean is approximately normal:

\\bar{x} \\sim N(\\mu, \\sigma^2 / n)
The standard error (SE) is the standard deviation of the sampling distribution:

\\text{SE} = s / \\sqrt{n}

where s is the sample standard deviation. SE decreases as 1/\\sqrt{n}: to halve the uncertainty you need 4 times more data.
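A quick simulation illustrates the square-root law (synthetic standard-normal data; the sample sizes are arbitrary): quadrupling n roughly halves the empirical standard error.

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_se(n, n_reps=20_000):
    """Std of the sample mean over many repeated samples of size n."""
    means = rng.normal(loc=0.0, scale=1.0, size=(n_reps, n)).mean(axis=1)
    return means.std()

se_25 = empirical_se(25)    # theory: 1 / sqrt(25)  = 0.20
se_100 = empirical_se(100)  # theory: 1 / sqrt(100) = 0.10

print(f"SE(n=25):  {se_25:.3f}")
print(f"SE(n=100): {se_100:.3f}")
print(f"ratio: {se_25 / se_100:.2f}")  # close to 2
```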
Confidence Intervals
A 95% confidence interval for the mean is:

\\bar{x} \\pm 1.96 \\cdot \\text{SE}

With small samples (n < 30), Student's t-distribution is used instead of the normal:

\\bar{x} \\pm t_{0.975,\\,n-1} \\cdot \\text{SE}
What it really means: a 95% CI does NOT mean "there is a 95% probability the true value is in the interval." It means: if we repeated the experiment infinitely many times, 95% of the calculated intervals would contain the true value. The difference is subtle but crucial.
import numpy as np
from scipy import stats
# Sample of accuracies from 10 experiments
accuracies = np.array([0.92, 0.89, 0.91, 0.93, 0.90, 0.88, 0.91, 0.94, 0.90, 0.92])
n = len(accuracies)
mean = np.mean(accuracies)
se = stats.sem(accuracies) # Standard Error
# 95% CI with t-distribution
t_critical = stats.t.ppf(0.975, df=n-1)
ci_lower = mean - t_critical * se
ci_upper = mean + t_critical * se
print(f"Mean: {mean:.4f}")
print(f"SE: {se:.4f}")
print(f"95% CI: [{ci_lower:.4f}, {ci_upper:.4f}]")
# Quick method with scipy
ci = stats.t.interval(0.95, df=n-1, loc=mean, scale=se)
print(f"95% CI (scipy): [{ci[0]:.4f}, {ci[1]:.4f}]")
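The frequentist interpretation can be checked by simulation (synthetic normal data with an arbitrary true mean of 0.90): build many 95% intervals from independent samples and count how often they cover the true value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
true_mean, true_std, n = 0.90, 0.05, 10
n_experiments = 10_000

t_crit = stats.t.ppf(0.975, df=n - 1)
covered = 0
for _ in range(n_experiments):
    sample = rng.normal(true_mean, true_std, n)
    m, se = sample.mean(), stats.sem(sample)
    if m - t_crit * se <= true_mean <= m + t_crit * se:
        covered += 1

coverage = covered / n_experiments
print(f"Empirical coverage: {coverage:.3f}")  # close to 0.95
```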
Hypothesis Testing
A hypothesis test evaluates whether observed data is compatible with a hypothesis:
- H_0 (null hypothesis): no effect (e.g., two models have the same accuracy)
- H_1 (alternative hypothesis): there is an effect
Test Statistic and P-Value
To compare a sample mean with a known value \\mu_0, the t-statistic is computed:

t = (\\bar{x} - \\mu_0) / (s / \\sqrt{n})

The p-value is the probability of observing a result this extreme (or more extreme) if H_0 were true. If p < \\alpha (typically 0.05), we reject H_0.
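As a sketch, `scipy.stats.ttest_1samp` computes exactly this t-statistic and p-value; here we reuse the accuracy sample from the CI example and test against a hypothetical baseline of 0.90:

```python
import numpy as np
from scipy import stats

accuracies = np.array([0.92, 0.89, 0.91, 0.93, 0.90,
                       0.88, 0.91, 0.94, 0.90, 0.92])

# H_0: the true mean accuracy equals 0.90
t_stat, p_value = stats.ttest_1samp(accuracies, popmean=0.90)
print(f"t-statistic: {t_stat:.4f}")
print(f"p-value: {p_value:.4f}")
```

With these ten values the p-value lands above 0.05, so the apparent improvement over 0.90 is not significant at this sample size.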
Type I and Type II Errors
- Type I error (false positive): rejecting H_0 when it is true. Its probability is the significance level \\alpha.
- Type II error (false negative): failing to reject H_0 when it is false. Its probability is \\beta.
The power of a test is 1 - \\beta: the probability of detecting a real effect when it exists.
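Both rates can be estimated by Monte Carlo (synthetic normal data; the group size and effect size are illustrative): under H_0 the rejection rate should be close to \\alpha, while under a real effect it estimates the power.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, alpha, n_sims = 30, 0.05, 5_000

def rejection_rate(true_diff):
    """Fraction of simulated two-sample t-tests with p < alpha."""
    rejections = 0
    for _ in range(n_sims):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(true_diff, 1.0, n)
        _, p = stats.ttest_ind(a, b)
        rejections += p < alpha
    return rejections / n_sims

type1 = rejection_rate(0.0)   # no real effect
power = rejection_rate(0.8)   # real effect, d = 0.8
print(f"Type I error rate: {type1:.3f}")  # close to alpha
print(f"Power (d = 0.8):   {power:.3f}")
```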
Two-Sample T-Test
To compare two models, we use the independent two-sample t-test:
import numpy as np
from scipy import stats
# Model A vs Model B: accuracies over 15 runs
model_a = np.array([0.92, 0.89, 0.91, 0.93, 0.90, 0.88, 0.91, 0.94,
0.90, 0.92, 0.91, 0.89, 0.93, 0.90, 0.91])
model_b = np.array([0.94, 0.93, 0.95, 0.92, 0.94, 0.91, 0.93, 0.95,
0.93, 0.94, 0.92, 0.93, 0.94, 0.93, 0.94])
# Two-sample t-test
t_stat, p_value = stats.ttest_ind(model_a, model_b)
print(f"Model A: mean={model_a.mean():.4f}, std={model_a.std():.4f}")
print(f"Model B: mean={model_b.mean():.4f}, std={model_b.std():.4f}")
print(f"t-statistic: {t_stat:.4f}")
print(f"p-value: {p_value:.6f}")
print(f"Significant (alpha=0.05): {p_value < 0.05}")
Effect Size: Beyond the P-Value
The p-value tells us whether an effect is statistically significant, but not how large it is. Effect size (Cohen's d) measures the magnitude:

d = (\\bar{x}_1 - \\bar{x}_2) / s_{\\text{pooled}}

Interpretation: d \\approx 0.2 small, d \\approx 0.5 medium, d \\approx 0.8 large.
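A minimal sketch computing Cohen's d for the two models from the t-test above (same accuracy arrays; the pooled standard deviation formula assumes equal group sizes):

```python
import numpy as np

model_a = np.array([0.92, 0.89, 0.91, 0.93, 0.90, 0.88, 0.91, 0.94,
                    0.90, 0.92, 0.91, 0.89, 0.93, 0.90, 0.91])
model_b = np.array([0.94, 0.93, 0.95, 0.92, 0.94, 0.91, 0.93, 0.95,
                    0.93, 0.94, 0.92, 0.93, 0.94, 0.93, 0.94])

# Pooled standard deviation (equal n, sample variances with ddof=1)
s_pooled = np.sqrt((model_a.var(ddof=1) + model_b.var(ddof=1)) / 2)
cohens_d = (model_b.mean() - model_a.mean()) / s_pooled
print(f"Cohen's d: {cohens_d:.2f}")
```

Here the difference is small in absolute terms (about 2.4 points of accuracy) but large relative to the run-to-run variability, so the effect size is large.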
A/B Testing for ML
A/B testing compares two variants (A = control, B = treatment) to determine which performs better. The setup requires:
- Define the metric: click-through rate, conversion, accuracy
- Calculate the required sample size (power analysis)
- Randomize users into groups
- Collect data for the predetermined duration
- Analyze with the appropriate statistical test
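When the metric is a rate (e.g. conversion or click-through rate), a common choice for the final step is the two-proportion z-test with a pooled proportion. A sketch with made-up counts:

```python
import numpy as np
from scipy import stats

# Hypothetical counts: conversions / users per variant
conv_a, n_a = 200, 5000   # control:   4.0% conversion
conv_b, n_b = 260, 5000   # treatment: 5.2% conversion

p_a, p_b = conv_a / n_a, conv_b / n_b
p_pool = (conv_a + conv_b) / (n_a + n_b)
se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = (p_b - p_a) / se
p_value = 2 * stats.norm.sf(abs(z))  # two-sided

print(f"Conversion A: {p_a:.3f}, B: {p_b:.3f}")
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```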
Power Analysis: How Many Samples Are Needed?
Power analysis calculates the required sample size to detect an effect of size d with significance level \\alpha and power 1 - \\beta. For a two-sided two-sample test, per group:

n = 2 (z_{1-\\alpha/2} + z_{1-\\beta})^2 / d^2
import numpy as np
from scipy import stats
def power_analysis(effect_size, alpha=0.05, power=0.8):
    """Required sample size per group for a two-sided two-sample test."""
    z_alpha = stats.norm.ppf(1 - alpha / 2)  # critical value for alpha
    z_beta = stats.norm.ppf(power)           # critical value for power
    n = 2 * ((z_alpha + z_beta) / effect_size) ** 2
    return int(np.ceil(n))
# Scenario: detect a 2% accuracy improvement
# Base accuracy: 90%, target: 92%, estimated std: 5%
effect_size = 0.02 / 0.05 # Cohen's d = 0.4
n_per_group = power_analysis(effect_size)
print(f"Effect size (Cohen's d): {effect_size:.2f}")
print(f"Sample size per group: {n_per_group}")
# Simulated A/B test
np.random.seed(42)
n = n_per_group
group_a = np.random.normal(0.90, 0.05, n) # Control
group_b = np.random.normal(0.92, 0.05, n) # Treatment
t_stat, p_value = stats.ttest_ind(group_a, group_b)
diff = group_b.mean() - group_a.mean()
s_pooled = np.sqrt((group_a.var(ddof=1) + group_b.var(ddof=1)) / 2)
cohens_d = diff / s_pooled
print(f"\nA/B Test Results:")
print(f" Group A: {group_a.mean():.4f}")
print(f" Group B: {group_b.mean():.4f}")
print(f" Difference: {diff:.4f}")
print(f" p-value: {p_value:.6f}")
print(f" Cohen's d: {cohens_d:.4f}")
print(f" Significant: {p_value < 0.05}")
Multiple Testing Correction
When running many tests simultaneously (e.g., comparing 10 models), the probability of at least one false positive increases. The Bonferroni correction divides the significance level by the number of tests:

\\alpha' = \\alpha / m

where m is the number of tests. It is conservative; for a less strict approach, the Benjamini-Hochberg procedure controls the False Discovery Rate (FDR).
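Both corrections can be sketched in a few lines (the p-values are illustrative; Benjamini-Hochberg is implemented directly rather than via a library):

```python
import numpy as np

p_values = np.array([0.001, 0.008, 0.020, 0.041, 0.045, 0.120, 0.300, 0.520])
m, alpha = len(p_values), 0.05

# Bonferroni: compare each p-value against alpha / m
bonferroni = p_values < alpha / m

# Benjamini-Hochberg: largest k with p_(k) <= (k/m) * alpha,
# then reject the k smallest p-values
order = np.argsort(p_values)
sorted_p = p_values[order]
thresholds = (np.arange(1, m + 1) / m) * alpha
below = np.nonzero(sorted_p <= thresholds)[0]
bh = np.zeros(m, dtype=bool)
if below.size > 0:
    k = below[-1]              # last sorted index passing its threshold
    bh[order[: k + 1]] = True  # reject it and all smaller p-values

print(f"Bonferroni rejects: {bonferroni.sum()} of {m}")
print(f"Benjamini-Hochberg rejects: {bh.sum()} of {m}")
```

As expected, Benjamini-Hochberg rejects at least as many hypotheses as the more conservative Bonferroni correction.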
Summary
Key Takeaways
- Standard Error: \\text{SE} = s / \\sqrt{n} - uncertainty decreases with more data
- 95% CI: \\bar{x} \\pm 1.96 \\cdot \\text{SE} - it is a frequency, not a probability
- P-value: probability of data this extreme if H_0 is true
- Effect size (Cohen's d): measures how large the effect is, not just significance
- Power analysis: calculate how many samples are needed before collecting data
- Bonferroni: correct \\alpha for multiple tests
In the Next Article: we will explore the mathematics of Transformers. Self-attention, scaled dot-product, multi-head attention, positional encoding: the formulas that revolutionized NLP and AI.