Introduction: Neural Networks for Sequences
Recurrent Neural Networks (RNNs) are designed to process sequential data: text, time series, audio, action sequences. Unlike feedforward networks that process independent inputs, RNNs maintain a hidden state that acts as memory, allowing the network to consider the context of previous information in the sequence.
However, classic RNNs suffer from the vanishing gradient problem: during training, the gradient attenuates rapidly across time steps, making long-term dependencies extremely difficult to learn. LSTMs (Long Short-Term Memory networks) and GRUs (Gated Recurrent Units) mitigate this problem with gating mechanisms.
What You Will Learn
- How RNNs maintain state across sequences
- The vanishing gradient problem and why it limits RNNs
- LSTM: input gate, forget gate, output gate, and cell state
- GRU: a lightweight alternative to LSTM
- Bidirectional RNNs and sequence-to-sequence models
- Practical implementation: text generation and sentiment analysis
RNN: Architecture and Hidden State
An RNN processes a sequence one element at a time, updating its hidden state at each step. This state vector captures a compressed summary of all information seen up to that point. The output at each time step depends on both the current input and the previous hidden state.
Formally, at each time step t the RNN computes:
- h_t = tanh(W_hh * h_(t-1) + W_xh * x_t + b_h): new hidden state combining previous state and current input
- y_t = W_hy * h_t + b_y: output at the current time step
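To make these formulas concrete, here is a minimal single-step implementation of the recurrence (the weight shapes and random initialization are illustrative assumptions, not tuned values):

```python
import torch

# Illustrative sizes: 10 input features, 64 hidden units, 2 outputs
input_size, hidden_size, output_size = 10, 64, 2
W_xh = torch.randn(hidden_size, input_size) * 0.1
W_hh = torch.randn(hidden_size, hidden_size) * 0.1
W_hy = torch.randn(output_size, hidden_size) * 0.1
b_h = torch.zeros(hidden_size)
b_y = torch.zeros(output_size)

def rnn_step(h_prev, x_t):
    # h_t = tanh(W_hh * h_(t-1) + W_xh * x_t + b_h)
    h_t = torch.tanh(W_hh @ h_prev + W_xh @ x_t + b_h)
    # y_t = W_hy * h_t + b_y
    y_t = W_hy @ h_t + b_y
    return h_t, y_t

# Run the recurrence over a sequence of 20 random time steps
h = torch.zeros(hidden_size)
for x_t in torch.randn(20, input_size):
    h, y = rnn_step(h, x_t)
print(h.shape, y.shape)  # torch.Size([64]) torch.Size([2])
```

Note that the same weight matrices are reused at every time step; this weight sharing is what makes an RNN a "recurrent" network rather than a very deep feedforward one.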
import torch
import torch.nn as nn

# Simple RNN in PyTorch
class SimpleRNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # x shape: (batch, seq_len, input_size)
        # h0 shape: (1, batch, hidden_size)
        h0 = torch.zeros(1, x.size(0), self.hidden_size, device=x.device)
        output, hidden = self.rnn(x, h0)
        # Use the last hidden state for classification
        out = self.fc(hidden.squeeze(0))
        return out

# Example: sequence of 20 time steps with 10 features each
model = SimpleRNN(input_size=10, hidden_size=64, output_size=2)
x = torch.randn(8, 20, 10)  # batch=8, seq=20, features=10
output = model(x)
print(f"Output: {output.shape}")  # [8, 2]
The Vanishing Gradient Problem
The vanishing gradient is the Achilles heel of classic RNNs. During backpropagation through time (BPTT), the gradient is repeatedly multiplied by the recurrent weight matrix W_hh at each time step. If the largest singular value of this matrix is less than 1, the gradient shrinks exponentially with sequence length; if it is greater than 1, the gradient explodes.
In practice, this means a classic RNN cannot learn dependencies that span more than 10-20 time steps. If the key word to understand the sentiment of a sentence is at the beginning and the output is at the end, the gradient will have nearly vanished before reaching that word.
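The effect is easy to observe empirically. The following sketch backpropagates from only the last output of a 100-step vanilla RNN and inspects how much gradient reaches each input position (sizes are arbitrary):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=10, hidden_size=64, batch_first=True)
x = torch.randn(1, 100, 10, requires_grad=True)  # 100 time steps
output, _ = rnn(x)

# Backpropagate only from the LAST time step's output
output[:, -1, :].sum().backward()

# Gradient magnitude reaching each input time step
grad_norms = x.grad.norm(dim=-1).squeeze(0)
print(f"grad at t=99 (last): {grad_norms[99]:.2e}")
print(f"grad at t=50:        {grad_norms[50]:.2e}")
print(f"grad at t=0 (first): {grad_norms[0]:.2e}")
```

With the default initialization, the gradient at the first time steps is typically orders of magnitude smaller than at the last ones, which is exactly why early inputs barely influence learning.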
Why LSTMs Are Needed
LSTMs solve vanishing gradient with an elegant insight: instead of forcing all information through repeated multiplications, they add a separate cell state that serves as an information "highway". Gates control which information to add, forget, or read from the cell state, allowing the gradient to flow unchanged across hundreds of time steps.
LSTM: Long Short-Term Memory
LSTMs, introduced by Hochreiter and Schmidhuber in 1997, solve vanishing gradient with four key components:
The Three Gates
- Forget Gate (f_t): decides which information to discard from the cell state. A sigmoid value between 0 (forget everything) and 1 (keep everything) for each dimension
- Input Gate (i_t): decides which new information to add to the cell state. Combines a sigmoid gate (how much to add) with a tanh candidate vector (what to add)
- Output Gate (o_t): decides which part of the cell state to use as output/hidden state. Filters the cell state through tanh and sigmoid
Cell State
The cell state is the heart of the LSTM. It flows through the temporal chain with only linear operations (multiplication and addition), allowing the gradient to propagate easily. The gates regulate the flow of information into and out of the cell state.
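The gate equations can be written out explicitly in a few lines. This is a didactic sketch of a single LSTM step (nn.LSTM fuses all of this into one efficient kernel; the shapes and random weights here are illustrative):

```python
import torch

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    # W, U, b hold the four gates stacked: [input, forget, cell, output]
    gates = x_t @ W.T + h_prev @ U.T + b
    i, f, g, o = gates.chunk(4, dim=-1)
    i = torch.sigmoid(i)      # input gate: how much new info to write
    f = torch.sigmoid(f)      # forget gate: how much old cell state to keep
    g = torch.tanh(g)         # candidate values to write
    o = torch.sigmoid(o)      # output gate: how much of the cell to expose
    c_t = f * c_prev + i * g  # cell state update: only * and +, gradients flow
    h_t = o * torch.tanh(c_t) # hidden state read out of the cell
    return h_t, c_t

input_size, hidden = 5, 8
W = torch.randn(4 * hidden, input_size) * 0.1
U = torch.randn(4 * hidden, hidden) * 0.1
b = torch.zeros(4 * hidden)
h, c = torch.zeros(1, hidden), torch.zeros(1, hidden)
h, c = lstm_step(torch.randn(1, input_size), h, c, W, U, b)
print(h.shape, c.shape)  # torch.Size([1, 8]) torch.Size([1, 8])
```

The key line is the cell state update: because c_t is built from element-wise multiplication and addition only, the gradient can pass through many steps without being squashed by repeated matrix multiplications.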
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    """LSTM for sequence classification"""
    def __init__(self, vocab_size, embed_dim, hidden_size,
                 num_layers, num_classes, dropout=0.3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(
            input_size=embed_dim,
            hidden_size=hidden_size,
            num_layers=num_layers,
            batch_first=True,
            dropout=dropout if num_layers > 1 else 0,
            bidirectional=True
        )
        self.dropout = nn.Dropout(dropout)
        # Bidirectional: hidden_size * 2
        self.fc = nn.Linear(hidden_size * 2, num_classes)

    def forward(self, x):
        embedded = self.dropout(self.embedding(x))
        lstm_out, (hidden, cell) = self.lstm(embedded)
        # Concatenate forward and backward hidden states
        hidden_cat = torch.cat((hidden[-2], hidden[-1]), dim=1)
        output = self.fc(self.dropout(hidden_cat))
        return output

# Sentiment analysis: vocab 10000, embedding 128, hidden 256
model = LSTMClassifier(
    vocab_size=10000,
    embed_dim=128,
    hidden_size=256,
    num_layers=2,
    num_classes=2  # positive/negative
)

# Input: batch of 16 sentences, max 50 tokens
x = torch.randint(0, 10000, (16, 50))
output = model(x)
print(f"Output: {output.shape}")  # [16, 2]
GRU: A Lighter Alternative
GRUs (Gated Recurrent Units), introduced by Cho et al. in 2014, are a simplified version of LSTMs. They combine the forget and input gates into a single update gate and merge cell state with hidden state, reducing the number of parameters by approximately 25%.
GRUs have two gates:
- Reset Gate (r_t): how much of the old hidden state to ignore when computing the new candidate
- Update Gate (z_t): how much of the old hidden state to keep vs how much of the new candidate to use
In practice, GRUs achieve comparable performance to LSTMs on many tasks with shorter training times. The choice depends on the task: for very long sequences LSTMs tend to be superior, for smaller datasets GRUs may be preferable due to their lower tendency to overfit.
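In PyTorch, swapping an LSTM for a GRU is a one-line change, and the parameter savings are easy to verify:

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=128, hidden_size=256, batch_first=True)
gru = nn.GRU(input_size=128, hidden_size=256, batch_first=True)

# GRU has 3 gate matrices vs the LSTM's 4, so ~25% fewer parameters
n_lstm = sum(p.numel() for p in lstm.parameters())
n_gru = sum(p.numel() for p in gru.parameters())
print(f"LSTM params: {n_lstm}, GRU params: {n_gru}")
print(f"GRU is {100 * (1 - n_gru / n_lstm):.0f}% smaller")  # ~25%

x = torch.randn(8, 50, 128)
out, h = gru(x)  # GRU returns only a hidden state (no separate cell state)
print(out.shape, h.shape)  # torch.Size([8, 50, 256]) torch.Size([1, 8, 256])
```

Note the interface difference: nn.LSTM returns a (hidden, cell) tuple, while nn.GRU returns just the hidden state, since the two are merged.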
Bidirectional RNNs and Sequence-to-Sequence
Bidirectional
A bidirectional RNN processes the sequence both forward (left to right) and backward (right to left), concatenating the two hidden states. This allows each position to have context from both the past and the future, which is fundamental for tasks like Named Entity Recognition where a word's meaning depends on the full context.
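A quick shape check illustrates the effect of bidirectional=True (sizes here are arbitrary):

```python
import torch
import torch.nn as nn

# bidirectional=True runs two LSTMs (forward and backward) over the sequence
# and concatenates their outputs, doubling the feature dimension per position
birnn = nn.LSTM(input_size=32, hidden_size=64,
                batch_first=True, bidirectional=True)
x = torch.randn(4, 10, 32)  # batch=4, seq=10, features=32
out, (h, c) = birnn(x)
print(out.shape)  # torch.Size([4, 10, 128]) -> hidden_size * 2 per position
print(h.shape)    # torch.Size([2, 4, 64])  -> one final state per direction
```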
Sequence-to-Sequence (Seq2Seq)
The Seq2Seq architecture uses an encoder RNN to compress the input sequence into a fixed-size context vector, and a decoder RNN to generate the output sequence. This architecture was fundamental for machine translation before the advent of Transformers.
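A minimal encoder-decoder sketch, assuming hypothetical vocabulary sizes and teacher forcing (the decoder receives the ground-truth target as input), without attention:

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Minimal encoder-decoder sketch (illustrative sizes, no attention)."""
    def __init__(self, src_vocab, tgt_vocab, embed_dim=64, hidden=128):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, embed_dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, embed_dim)
        self.encoder = nn.GRU(embed_dim, hidden, batch_first=True)
        self.decoder = nn.GRU(embed_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt):
        # Encoder compresses the whole source into one context vector
        _, context = self.encoder(self.src_embed(src))
        # Decoder starts from that context and generates the target
        dec_out, _ = self.decoder(self.tgt_embed(tgt), context)
        return self.out(dec_out)  # (batch, tgt_len, tgt_vocab) logits

model = Seq2Seq(src_vocab=1000, tgt_vocab=1200)
src = torch.randint(0, 1000, (4, 12))  # source: 12 tokens
tgt = torch.randint(0, 1200, (4, 15))  # target: 15 tokens
logits = model(src, tgt)
print(logits.shape)  # torch.Size([4, 15, 1200])
```

The single `context` tensor passed from encoder to decoder is exactly the fixed-size bottleneck discussed below in the section on attention.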
class TextGenerator(nn.Module):
    """Character-by-character text generator"""
    def __init__(self, vocab_size, embed_dim, hidden_size, num_layers):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Inter-layer dropout only applies with more than one layer
        self.lstm = nn.LSTM(embed_dim, hidden_size, num_layers,
                            batch_first=True,
                            dropout=0.2 if num_layers > 1 else 0)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, x, hidden=None):
        embedded = self.embedding(x)
        output, hidden = self.lstm(embedded, hidden)
        logits = self.fc(output)
        return logits, hidden

    def generate(self, start_token, max_len=100, temperature=0.8):
        """Generate text auto-regressively"""
        self.eval()
        current = start_token.unsqueeze(0).unsqueeze(0)
        hidden = None
        generated = [start_token.item()]
        with torch.no_grad():
            for _ in range(max_len):
                logits, hidden = self(current, hidden)
                # Sample from the distribution over the last position
                logits = logits[:, -1, :] / temperature
                probs = torch.softmax(logits, dim=-1)
                next_token = torch.multinomial(probs, 1)
                generated.append(next_token.item())
                current = next_token
        return generated
The Advent of Attention
The main limitation of the Seq2Seq model is the bottleneck: all information from the input sequence is compressed into a single fixed vector. For long sequences, this vector fails to capture all the details. The Attention mechanism, introduced by Bahdanau et al. in 2014, solves this problem by allowing the decoder to "look" directly at all encoder positions. This idea led to Transformers, which we will explore in the next article.
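The core of the idea fits in a few lines. This sketch uses simple dot-product scoring for brevity (Bahdanau's original formulation scores with a small feedforward network), with arbitrary sizes:

```python
import torch
import torch.nn.functional as F

# Instead of one fixed context vector, the decoder scores every encoder
# position and takes a weighted average at each decoding step.
encoder_states = torch.randn(1, 12, 128)  # (batch, src_len, hidden)
decoder_state = torch.randn(1, 128)       # current decoder hidden state

scores = torch.bmm(encoder_states, decoder_state.unsqueeze(-1)).squeeze(-1)
weights = F.softmax(scores, dim=-1)       # one weight per source position
context = torch.bmm(weights.unsqueeze(1), encoder_states).squeeze(1)
print(weights.shape, context.shape)  # torch.Size([1, 12]) torch.Size([1, 128])
```

Because the weights are recomputed at every decoder step, each output token can focus on a different part of the input, removing the fixed-vector bottleneck.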
Practical Applications
RNNs and LSTMs find applications in numerous domains:
- Sentiment Analysis: classifying the sentiment of reviews, tweets, comments. Bidirectional LSTMs capture complete context
- Time Series Forecasting: predicting stock prices, energy consumption, system metrics. LSTMs excel at capturing seasonal patterns
- Text Generation: generating text character by character or word by word, from chatbots to computational poetry
- Machine Translation: automatic translation with Seq2Seq + Attention architecture (predecessor of Transformers)
- Speech Recognition: audio-to-text conversion, where acoustic sequences are mapped to phoneme and word sequences
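As a small taste of the forecasting use case, a next-step predictor over a univariate series might look like this (the architecture and sizes are illustrative, not a tuned model):

```python
import torch
import torch.nn as nn

class Forecaster(nn.Module):
    """Sketch: predict the next value of a series from a sliding window."""
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):               # x: (batch, window, 1)
        _, (h, _) = self.lstm(x)
        return self.head(h.squeeze(0))  # (batch, 1): next-step prediction

model = Forecaster()
# A 30-step window of a sine wave as a toy input series
window = torch.sin(torch.linspace(0, 6.28, 30)).reshape(1, 30, 1)
pred = model(window)
print(pred.shape)  # torch.Size([1, 1])
```

In a real pipeline one would train this with a regression loss (e.g. MSE) over many windows sliced from the historical series.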
Next Steps in the Series
- In the next article we will explore Transformers, the architecture that made RNNs obsolete for most NLP tasks
- We will cover self-attention, multi-head attention, and positional encoding
- We will analyze BERT and GPT: how they revolutionized Natural Language Processing