Introduction: What Are Neural Networks
Artificial neural networks are the foundation of modern deep learning. Inspired by the structure of the human brain, these computational architectures can learn complex patterns from data through an iterative optimization process called backpropagation. From image classification to machine translation, neural networks power the most advanced artificial intelligence applications in the world.
In this first article of the Deep Learning and Neural Networks series, we will start from the historical origins with the Perceptron (1958) and build up to the fundamental concepts that enable networks to learn: weights, biases, activation functions, gradient descent, and backpropagation. By the end, we will implement a neural network from scratch in Python and PyTorch.
What You Will Learn
- The history of neural networks: from the Perceptron to modern Deep Learning
- Neural network architecture: input, hidden, and output layers
- Activation functions: ReLU, Sigmoid, Tanh and visual comparison
- Backpropagation: how the network computes gradients and updates weights
- Loss functions: MSE and Cross-Entropy for different tasks
- Practical implementation in NumPy and PyTorch
The Perceptron: The First Artificial Neuron
In 1958, Frank Rosenblatt introduced the Perceptron, the first model of an artificial neuron. The idea was simple yet revolutionary: a computational unit that receives numerical inputs, multiplies them by weights, sums the results, and produces a binary output through a threshold function.
Mathematically, the perceptron computes a weighted sum of its inputs plus a bias term, then applies a step function to produce the output:
```python
# Simple Perceptron in Python
import numpy as np

class Perceptron:
    def __init__(self, n_inputs, learning_rate=0.01):
        self.weights = np.random.randn(n_inputs)
        self.bias = 0.0
        self.lr = learning_rate

    def predict(self, x):
        """Forward pass: weighted sum + threshold"""
        linear_output = np.dot(x, self.weights) + self.bias
        return 1 if linear_output >= 0 else 0

    def train(self, X, y, epochs=100):
        """Perceptron learning rule"""
        for _ in range(epochs):
            for xi, yi in zip(X, y):
                prediction = self.predict(xi)
                error = yi - prediction
                self.weights += self.lr * error * xi
                self.bias += self.lr * error

# Example: AND gate
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

p = Perceptron(n_inputs=2)
p.train(X, y, epochs=50)
print([p.predict(xi) for xi in X])  # [0, 0, 0, 1]
```
The perceptron works perfectly for linearly separable problems like AND and OR. However, in 1969 Minsky and Papert demonstrated that a single perceptron cannot solve the XOR problem, where classes are not separable by a straight line. This discovery slowed neural network research for over a decade, a period known as the AI Winter.
The XOR Limitation and the Need for Deep Learning
The XOR problem showed that hidden layers are needed to solve non-linearly separable problems. This insight led to the development of Multi-Layer Perceptrons (MLPs) and, decades later, modern deep learning. Today we know that even a single hidden layer with a non-linear activation function allows a neural network to approximate any continuous function on a compact domain to arbitrary accuracy, given enough hidden neurons (Universal Approximation Theorem).
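To make this concrete, here is a hand-constructed 2-2-1 network that computes XOR, something no single perceptron can do. The weights are chosen by hand purely for illustration, not learned:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

# Hidden neuron 1 computes x1 + x2; hidden neuron 2 computes x1 + x2 - 1
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
# Output combines them as h1 - 2*h2
W2 = np.array([1.0, -2.0])

def xor_net(x):
    h = relu(x @ W1 + b1)           # hidden layer with non-linear activation
    return int(h @ W2 >= 0.5)       # threshold the output

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
print([xor_net(x) for x in X])  # [0, 1, 1, 0]
```

The hidden layer bends the input space so that the two classes become linearly separable, which is exactly what a single perceptron cannot do on its own.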
Neural Network Architecture: Layers and Neurons
A neural network is organized into layers of interconnected neurons. The classic architecture comprises three types of layers:
- Input Layer: receives raw data. Each neuron represents a dataset feature (e.g., image pixels, text words)
- Hidden Layer(s): one or more intermediate layers where processing occurs. Each neuron receives input from the previous layer, applies weights and bias, and passes the result through an activation function
- Output Layer: produces the final prediction. For binary classification: 1 neuron with sigmoid. For multi-class: N neurons with softmax
The forward pass is the process by which data flows from input to output through all layers. For each neuron, the computation follows three steps: weighted sum of inputs, bias addition, and activation function application.
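The three steps can be sketched in a few lines of NumPy. The layer sizes and random inputs below are illustrative choices, not prescribed by any dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_forward(x, W, b, activation):
    z = x @ W + b          # steps 1 and 2: weighted sum plus bias
    return activation(z)   # step 3: activation function

relu = lambda z: np.maximum(0, z)

x = rng.normal(size=(4,))       # 4 input features
W1 = rng.normal(size=(4, 3))    # input -> hidden layer (3 neurons)
b1 = np.zeros(3)
W2 = rng.normal(size=(3, 2))    # hidden -> output layer (2 neurons)
b2 = np.zeros(2)

h = layer_forward(x, W1, b1, relu)          # hidden activations
y = layer_forward(h, W2, b2, lambda z: z)   # linear output layer
print(y.shape)  # (2,)
```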
Activation Functions: ReLU, Sigmoid, and Tanh
Activation functions introduce non-linearity into the network, enabling it to learn complex relationships in the data. Without them, a network with N layers would be equivalent to a single linear layer, regardless of depth.
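The collapse of stacked linear layers into a single one can be verified directly; the shapes below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(42)
W1 = rng.normal(size=(5, 4))
W2 = rng.normal(size=(4, 3))
x = rng.normal(size=(10, 5))

# Two linear layers applied in sequence (no activation in between)...
two_layers = (x @ W1) @ W2
# ...are exactly one linear layer with the merged weight matrix W1 @ W2
one_layer = x @ (W1 @ W2)

print(np.allclose(two_layers, one_layer))  # True
```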
Sigmoid
The sigmoid function squashes any value into the range (0, 1). Historically used as the standard activation, today it is primarily employed in the output layer for binary classification. Its main problem is vanishing gradient: for very high or very low values, the gradient becomes nearly zero, drastically slowing down learning.
Tanh
The tanh (hyperbolic tangent) function maps values to the range (-1, 1). Zero-centered, it offers stronger gradients than sigmoid, making it preferable in hidden layers. However, it also suffers from vanishing gradient for extreme values.
ReLU (Rectified Linear Unit)
ReLU is the most widely used activation function in modern deep learning. Its formula is extremely simple: f(x) = max(0, x). The advantages are numerous: efficient computation, no vanishing gradient for positive values, and promotion of sparse representations. The main downside is the dying ReLU problem: neurons that consistently receive negative inputs output zero, their gradient vanishes, and they stop learning entirely. Variants like Leaky ReLU mitigate this by allowing a small slope for negative inputs.
```python
import numpy as np
import matplotlib.pyplot as plt

# Activation function implementations
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0, x)

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

# Derivatives for backpropagation
def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1 - s)

def relu_derivative(x):
    return np.where(x > 0, 1, 0)

# Visual comparison
x = np.linspace(-5, 5, 200)
fig, axes = plt.subplots(1, 4, figsize=(16, 4))
for ax, func, name in zip(axes, [sigmoid, tanh, relu, leaky_relu],
                          ['Sigmoid', 'Tanh', 'ReLU', 'Leaky ReLU']):
    ax.plot(x, func(x), linewidth=2)
    ax.set_title(name)
    ax.grid(True, alpha=0.3)
    ax.axhline(y=0, color='k', linewidth=0.5)
    ax.axvline(x=0, color='k', linewidth=0.5)
plt.tight_layout()
plt.savefig('activation_functions.png', dpi=150)
```
Loss Functions: Measuring Error
The loss function quantifies how much the network's predictions deviate from the actual values. It provides the error signal that guides learning during training.
MSE (Mean Squared Error)
Used for regression problems, MSE computes the average of squared differences between predictions and actual values. It penalizes large errors more heavily, making it sensitive to outliers.
Cross-Entropy
For classification problems, Cross-Entropy measures the distance between the predicted probability distribution and the actual one. For binary classification, Binary Cross-Entropy is used; for multi-class, Categorical Cross-Entropy. Cross-Entropy produces stronger gradients when the network is very confident but wrong, accelerating correction.
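Both losses are short enough to implement directly. The example values below are illustrative; the epsilon clip in the cross-entropy avoids taking log(0):

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean of squared differences: penalizes large errors quadratically
    return np.mean((y_true - y_pred) ** 2)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # Clip predictions away from 0 and 1 so the logarithms stay finite
    p = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y_true = np.array([1.0, 0.0, 1.0])
good = np.array([0.9, 0.2, 0.8])    # mostly correct predictions
bad = np.array([0.01, 0.2, 0.8])    # confidently wrong on the first sample

print(mse(y_true, good))                      # 0.03
print(binary_cross_entropy(y_true, good))
print(binary_cross_entropy(y_true, bad))      # much larger loss
```

Note how the confidently wrong prediction (0.01 for a true label of 1) inflates the cross-entropy far more than the mild errors do: that steep penalty is exactly what accelerates correction.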
Backpropagation: How the Network Learns
Backpropagation is the fundamental algorithm that enables neural networks to learn. Introduced by Rumelhart, Hinton, and Williams in 1986, it applies the chain rule of calculus to compute the gradient of the loss function with respect to every weight in the network.
The process consists of four phases:
- Forward Pass: data flows through the network from input to output, computing activations at each layer
- Loss Computation: the error between predicted output and actual value is measured
- Backward Pass: gradients are computed starting from the output towards the input, propagating the error backwards
- Weight Update: each weight is modified in the opposite direction of the gradient, proportionally to the learning rate
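The four phases above can be sketched from scratch in NumPy on a tiny network learning XOR. The hidden size, learning rate, and iteration count here are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
sigmoid = lambda z: 1 / (1 + np.exp(-z))
lr, losses = 0.5, []

for _ in range(5000):
    # 1. Forward pass
    a1 = np.tanh(X @ W1 + b1)
    a2 = sigmoid(a1 @ W2 + b2)
    # 2. Loss computation (binary cross-entropy)
    losses.append(-np.mean(y * np.log(a2) + (1 - y) * np.log(1 - a2)))
    # 3. Backward pass: chain rule, from output towards input
    dz2 = (a2 - y) / len(X)                # dL/dz2 for sigmoid + cross-entropy
    dW2, db2 = a1.T @ dz2, dz2.sum(axis=0)
    dz1 = (dz2 @ W2.T) * (1 - a1 ** 2)     # tanh'(z) = 1 - tanh(z)^2
    dW1, db1 = X.T @ dz1, dz1.sum(axis=0)
    # 4. Weight update: step against the gradient
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

preds = (sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2) > 0.5).astype(int)
print(preds.ravel(), losses[-1])
```

After training, the loss has dropped far below its initial value and the network separates the XOR classes that defeated the single perceptron.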
Gradient Descent: The Fundamental Optimizer
Gradient Descent updates weights according to the formula: w = w - lr * dL/dw. Modern variants include SGD with momentum (accumulates velocity in the gradient direction), Adam (adaptive learning rate per parameter), and AdamW (Adam with corrected weight decay). Adam is the default optimizer in most deep learning applications.
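The update rule is easiest to see on a toy problem: minimizing the quadratic L(w) = (w - 3)^2, whose gradient is dL/dw = 2(w - 3). The learning rate, momentum coefficient, and iteration counts below are illustrative:

```python
def grad(w):
    return 2 * (w - 3)   # dL/dw for L(w) = (w - 3)^2

# Vanilla gradient descent: w = w - lr * dL/dw
w, lr = 0.0, 0.1
for _ in range(100):
    w -= lr * grad(w)
w_gd = w

# SGD with momentum: accumulate velocity in the gradient direction
w, v, beta = 0.0, 0.0, 0.9
for _ in range(200):
    v = beta * v + grad(w)
    w -= lr * v
w_momentum = w

print(w_gd, w_momentum)  # both approach the minimum at w = 3
```

Adam builds on the same idea, additionally keeping a running estimate of the squared gradient so each parameter gets its own effective learning rate.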
Complete Implementation: MLP in PyTorch
Let us put everything together by implementing a Multi-Layer Perceptron for handwritten digit classification (MNIST dataset) using PyTorch:
```python
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# MLP Model Definition
class MLP(nn.Module):
    def __init__(self, input_size=784, hidden_sizes=[256, 128], num_classes=10):
        super().__init__()
        self.flatten = nn.Flatten()
        self.network = nn.Sequential(
            nn.Linear(input_size, hidden_sizes[0]),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_sizes[0], hidden_sizes[1]),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_sizes[1], num_classes)
        )

    def forward(self, x):
        x = self.flatten(x)
        return self.network(x)

# Dataset and DataLoader setup
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])
train_dataset = datasets.MNIST('./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST('./data', train=False, transform=transform)
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=1000, shuffle=False)

# Training
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = MLP().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

for epoch in range(10):
    model.train()
    total_loss = 0
    for batch_x, batch_y in train_loader:
        batch_x, batch_y = batch_x.to(device), batch_y.to(device)
        optimizer.zero_grad()
        output = model(batch_x)
        loss = criterion(output, batch_y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()

    # Evaluation
    model.eval()
    correct = 0
    with torch.no_grad():
        for batch_x, batch_y in test_loader:
            batch_x, batch_y = batch_x.to(device), batch_y.to(device)
            output = model(batch_x)
            pred = output.argmax(dim=1)
            correct += (pred == batch_y).sum().item()
    accuracy = 100. * correct / len(test_dataset)
    print(f'Epoch {epoch+1}: Loss={total_loss/len(train_loader):.4f}, '
          f'Accuracy={accuracy:.2f}%')
```
This model achieves approximately 98% accuracy on MNIST after 10 epochs. The network has two hidden layers (256 and 128 neurons), uses ReLU activation, Dropout for regularization, and the Adam optimizer with a learning rate of 0.001.
Deep Learning: Why Multiple Layers Work
Deep learning is distinguished from traditional machine learning by the use of networks with many hidden layers. But why is depth so important?
The answer lies in hierarchical feature composition. Each layer learns to recognize patterns at an increasing level of abstraction:
- Layer 1: Detects edges, gradients, and simple textures
- Layer 2: Combines edges into geometric shapes (corners, curves)
- Layer 3: Recognizes object parts (eyes, wheels, windows)
- Layer 4+: Identifies complete objects and composed scenes
This hierarchy of representations is why deep networks such as ResNet (152 layers) can achieve superhuman performance in image classification, while a single layer could never capture the same complexity.
However, depth also brings challenges: vanishing gradient makes training very deep networks difficult because the error signal attenuates as it passes through many layers. Modern solutions include skip connections (ResNet), batch normalization, and activation functions like ReLU that maintain more stable gradients.
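A skip connection is simple to express in PyTorch. The block below is a minimal sketch in the spirit of ResNet, not a faithful ResNet block (those use convolutions and batch normalization); the dimensions are illustrative:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.f = nn.Sequential(
            nn.Linear(dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, x):
        # Skip connection: the output is f(x) + x, so the gradient always
        # has an identity path back through the block, even when f's
        # gradients are small
        return torch.relu(self.f(x) + x)

x = torch.randn(8, 64)
block = ResidualBlock()
print(block(x).shape)  # torch.Size([8, 64])
```

Because the identity path bypasses the learned transformation, stacking many such blocks no longer multiplies small gradients together layer after layer, which is what makes very deep networks trainable.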
Next Steps in the Series
- In the next article we will explore Convolutional Neural Networks (CNNs), the architecture that revolutionized computer vision
- We will see how convolutions and pooling extract spatial features from images
- We will implement classic architectures like VGG and ResNet in PyTorch