NEURAL NETWORKS: THE MACHINE THAT RULES OUR LIVES

Neural networks: the system that turned language into a computational commodity

Introduction: The Intelligence That Isn’t

Call it by its real name: artificial intelligence doesn’t exist. What we’ve built isn’t a digital brain that thinks — it’s a statistically monstrous engine for pattern recognition. Yet this mathematical machine is rewriting the social contract between language, power, and truth.

Neural networks aren’t new. They carry seventy years of messianic promises, glacial winters, improbable resurrections. Their trajectory reveals more about our obsession with control than about scientific progress. Because behind every technological breakthrough lies the same question: who decides what the machine learns? And more importantly: what do we learn by watching the machine learn?

This isn’t a technical manual. It’s a map for decoding the apparatus that transformed language into a computational commodity.

Genesis: When Neurons Were Still Analog

In 1943, Warren McCulloch and Walter Pitts publish a paper that reads like science fiction: we can mathematically describe how a biological neuron works. The brain is electricity. Electricity is computation. Computation can be simulated. The promise is clear: replicate human intelligence with paper, pen, and Boolean logic.

Frank Rosenblatt builds the Perceptron in 1958 — the first machine that truly “learns.” It doesn’t follow rigid instructions; it modifies itself based on errors. The press goes wild. The New York Times writes that soon machines will walk, talk, see, reproduce themselves. The future has arrived.

But there’s a problem. A fatal one. In 1969, Marvin Minsky and Seymour Papert mathematically prove that the Perceptron can’t even solve a trivial problem like the XOR function. The miracle machine is a toy. Funding evaporates. The first AI winter arrives — a freeze lasting twenty years.

The lesson is brutal: technological hype doesn’t survive mathematics. But this lesson will be forgotten. Multiple times.

Resurrection: Backpropagation and the Weight Miracle

In 1986, Geoffrey Hinton, David Rumelhart, and Ronald Williams resurrect a dormant algorithm: error backpropagation. The idea is simple yet powerful. When the neural network makes mistakes, we don’t start over — we send the error backward, layer by layer, adjusting connection weights. The machine learns from its failures.
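
To make the mechanism concrete, here is a minimal sketch in Python of the elementary move backpropagation repeats for every connection in every layer: nudge a weight against the gradient of the error. The input, target, and learning rate below are invented for illustration, not taken from any real network.

```python
# One weight adjusted by gradient descent on a squared error:
# the elementary step backpropagation repeats for every connection.
w = 0.1                 # initial connection weight (arbitrary)
x, target = 2.0, 1.0    # one input signal and the answer we want
learning_rate = 0.1

for step in range(20):
    y = w * x                        # forward pass: the neuron's output
    error = y - target               # how wrong it is
    gradient = 2 * error * x         # d(error**2)/dw
    w -= learning_rate * gradient    # send the correction back into the weight

print(round(w, 4))  # converges toward 0.5, the weight that makes w*x equal the target
```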

Suddenly neural networks can have depth. No longer a single layer of neurons, but complex architectures with dozens, hundreds, thousands of stacked levels. Deep Learning becomes theoretically possible. But practically, computers are too slow, datasets too small, memory too expensive.

We need to wait. Another twenty years.

2012: The Moment That Changed Neural Networks

AlexNet. Eight layers. Sixty million parameters. Trained on two GPUs for days. Wins ImageNet by margins that humiliate all competitors. This isn’t victory — it’s public execution of traditional computer vision.

What changed? Three things. First: GPUs, designed for video games, prove perfect for the massive parallel computation neural networks require. Second: the Internet generated billions of labeled images — fuel for training computational monsters. Third: algorithms improved, but more importantly, the capacity to brutally parallelize computation improved.

From that moment, every tech company understands that whoever controls the largest neural models controls the future. The arms race begins. Microsoft, Google, Meta, Amazon invest billions. Startups are born and die at the rhythm of funding cycles. AI becomes critical infrastructure. And like all critical infrastructure, it becomes political.

Milan at night, as a metaphor: infrastructure, glow, control — the city as a neural system.

Anatomy of Control: How Neural Networks Work

Imagine an assembly line where each worker examines only an infinitesimal product detail and passes the result to the next. Nobody has the overview. Yet at the chain’s end, an automobile emerges.

Neural networks work this way. They’re organized in layers: input, hidden layers, output. Each neuron receives signals from the previous layer’s neurons, applies a simple calculation — weighted sum plus threshold — and decides whether to activate or stay dormant. The signal proceeds forward like a wave through the structure.

Weights are everything. Every neuron connection has a weight determining how much that signal matters. During training, the network continuously compares its results with correct ones. Every error generates a correction signal that travels back up the chain, modifying weights. It’s a blind, mechanical, statistical process. No comprehension. Only incremental adjustment toward minimum possible error.

The activation function decides whether the neuron “fires” its signal. ReLU, Sigmoid, Tanh: technical names for mathematical thresholds transforming continuous numbers into decisions. On or off. Yes or no. Cat or dog.
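
A minimal sketch, in plain Python, of what one link in that assembly line does: a weighted sum of its inputs plus a bias, passed through a ReLU threshold. The weights, biases, and inputs below are arbitrary stand-ins, not values from any trained network.

```python
def relu(z):
    """Activation: pass the value if positive, stay dormant otherwise."""
    return max(0.0, z)

def neuron(inputs, weights, bias):
    """Weighted sum plus threshold: the whole 'decision' a neuron makes."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return relu(z)

# A toy layer: three neurons reading the same two-signal input.
inputs = [0.8, -0.3]
layer = [
    ([0.5, 1.2], 0.1),    # (weights, bias) for neuron 1
    ([-0.7, 0.4], 0.0),   # neuron 2
    ([1.0, -1.0], -0.2),  # neuron 3
]
outputs = [neuron(inputs, w, b) for w, b in layer]
print([round(o, 2) for o in outputs])  # [0.14, 0.0, 0.9]: the wave that travels to the next layer
```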

“The neural network doesn’t read in the human sense. It performs calculations on constantly converted data.”
— Alfio Ferrara, Le macchine del linguaggio (Einaudi, 2025)

Here’s the point. We don’t “understand” what happens inside because there’s nothing to understand in human terms. There’s only an immense chain of multiplications and additions statistically converging toward recognizable patterns. The neural network doesn’t know what a cat is. But it recognizes which pixel configurations tend to appear together when humans label something “cat.”

It’s a simulacrum of comprehension. Monstrously effective. Dangerously opaque.

Tokenization: Colonizing Language

Machines don’t read words. They read numbers. Before training a language model, we must betray language: fragment it, index it, transform it into geometric coordinates.

Tokenization is the first act of violence. Take a sentence — “The cat runs quickly” — and break it into discrete units called tokens. They don’t always match words. They can be word pieces, suffixes, prefixes. The algorithm chooses based on statistical frequency in the training corpus. Each token receives a unique number: an ID.
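
A toy illustration of that pipeline. The vocabulary below is hand-made and the “##” marker for word pieces is borrowed as a convention; real tokenizers (BPE and its relatives) build their vocabularies from corpus frequencies rather than by hand.

```python
# Toy tokenizer: an invented vocabulary standing in for one learned
# from corpus statistics. Real systems split on frequent subwords.
vocab = {"The": 0, "cat": 1, "run": 2, "##s": 3, "quick": 4, "##ly": 5}

# "runs" and "quickly" don't exist as whole tokens: each is split
# into a stem plus a suffix piece, exactly as described above.
tokens = ["The", "cat", "run", "##s", "quick", "##ly"]
ids = [vocab[t] for t in tokens]
print(ids)  # [0, 1, 2, 3, 4, 5] -- language reduced to a list of integers
```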

But an ID means nothing. It’s just a label. The next step is embedding: transforming that ID into a vector — a list of hundreds of numbers representing the word’s “position” in multidimensional mathematical space.

Something disturbing happens here. In embedding space, semantically similar words end up close together. The machine never studied grammar or semantics: it only learned which words tend to appear in the same contexts. Statistical proximity becomes semantic proximity.
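
A minimal sketch of what “statistical proximity becomes semantic proximity” means operationally: each token maps to a vector, and closeness is measured geometrically, here with cosine similarity. The vectors are tiny and invented; real embeddings have hundreds of learned dimensions.

```python
import math

# Invented 4-dimensional embeddings (real ones are learned, not hand-written).
embedding = {
    "cat":   [0.9, 0.1, 0.3, 0.0],
    "dog":   [0.8, 0.2, 0.4, 0.1],
    "piano": [0.0, 0.9, 0.1, 0.8],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

print(cosine(embedding["cat"], embedding["dog"]))    # high: words that share contexts
print(cosine(embedding["cat"], embedding["piano"]))  # low: words that rarely co-occur
```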

But what happens when the relationships we map are infected with prejudices, stereotypes, power asymmetries? The machine doesn’t filter: it absorbs. Reproduces. Amplifies.

Embedding space is a distorting mirror of the culture that generated it. And that reflection is then used to generate new language, which in turn contaminates future training corpora. It’s a self-reinforcing loop.

Neural Networks and the Probabilistic Regime: How AI Chooses Words

When a language model answers a question, it isn’t “thinking” about the response. It’s calculating which token has the highest probability of following previous tokens, given everything it saw during training.

The network’s final layer transforms raw scores into a probability distribution. The machine doesn’t choose a word because it understands meaning. It chooses because — given context — that token is statistically most likely to come next.
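
A minimal sketch of that final layer’s job: raw scores are pushed through a softmax to become a probability distribution, and the most likely token wins. The candidate tokens and their scores below are invented for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Invented scores for the token that follows "The cat ..."
candidates = ["runs", "sleeps", "piano", "the"]
scores = [3.1, 2.8, -1.0, 0.2]

probs = softmax(scores)
for token, p in zip(candidates, probs):
    print(f"{token:>7}: {p:.2%}")

# Greedy decoding: pick the statistically most likely continuation.
print("chosen:", candidates[probs.index(max(probs))])  # "runs"
```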

It’s statistical eloquence. Fluency doesn’t come from comprehension. It comes from scale: billions of examples ingested and digested into number matrices. This means AI cannot be neutral. Definitionally impossible. If the training corpus contains asymmetries, the model reproduces them — not from malice, but from mathematics.

Architectures of Dominance: CNN, RNN, Transformer

CNN: Vision as Surveillance

Convolutional Neural Networks dominate artificial vision. They scan images for local patterns: edges, textures, recurring shapes. The first layers see lines. Later layers see eyes. Deeper layers recognize faces. They’re the technology behind mass facial recognition, automated surveillance, and social sorting.
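
The core operation, reduced to a toy: a small filter slid across an image, responding strongly wherever a local pattern appears, here a vertical edge. The “image” and the filter are hand-made for illustration; a trained CNN learns its filters from data.

```python
# Toy 2D convolution: a 3x3 vertical-edge filter slid across a tiny
# grayscale "image" (bright left half, dark right half).
image = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
]
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

def convolve(img, ker):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(ker), len(ker[0])
    out = []
    for i in range(len(img) - kh + 1):
        row = []
        for j in range(len(img[0]) - kw + 1):
            row.append(sum(img[i + di][j + dj] * ker[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

for row in convolve(image, kernel):
    print(row)  # [0, 3, 3, 0]: strong responses exactly where the edge sits
```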

RNN: Sequence, Memory, Forgetting

Recurrent Neural Networks were designed for sequences: text, audio, time series. They have memory, but they also suffer architectural amnesia: after a limited span, they forget the beginning. For years, this constraint shaped what “AI language” could be.
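
A minimal sketch of the recurrence that gives these networks memory, and of why that memory fades: each step squashes the previous hidden state again, so the trace of an early token shrinks toward zero. The scalar weights here are invented, and a single unit stands in for an entire hidden layer.

```python
import math

# One-unit recurrent step: h_t = tanh(w_x * x_t + w_h * h_prev)
w_x, w_h = 0.5, 0.4
h = 1.0                  # imagine the first token left a strong trace
sequence = [0.0] * 15    # then fifteen uninformative steps follow

for x in sequence:
    h = math.tanh(w_x * x + w_h * h)   # the old state is squashed again each step

print(f"{h:.2e}")  # ~1e-06: the trace of the first token has all but vanished
```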

Transformers: Attention as Power

In 2017, “Attention Is All You Need” changes everything. Transformers introduce attention: instead of processing tokens one at a time, the network sees the entire sequence and decides which parts matter to which others. It can relate the first word to the thousandth without signal loss.
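
A minimal sketch of scaled dot-product attention, the operation the paper’s title refers to: every token scores every other token, the scores become weights through a softmax, and each position’s output is a weighted mix of the whole sequence. The vectors here are random stand-ins (numpy assumed), not the learned projections a real Transformer uses.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d = 5, 8                     # five tokens, eight-dimensional vectors

# In a real Transformer, Q, K and V come from learned projections of the
# token embeddings; here they are random placeholders.
Q = rng.normal(size=(seq_len, d))
K = rng.normal(size=(seq_len, d))
V = rng.normal(size=(seq_len, d))

scores = Q @ K.T / np.sqrt(d)         # how much each token attends to each other token
weights = np.exp(scores)
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row

output = weights @ V                  # each position: a weighted mix of all positions
print(weights.round(2))               # the attention map: who looks at whom
```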

Transformers scale. Add parameters and you get improvements: 100 million, 1 billion, 175 billion, beyond. Each order of magnitude becomes a competitive edge. And here the political problem emerges: only actors with billions of dollars and factory-sized datacenters can train frontier models. Cutting-edge AI becomes an oligopoly. Everyone else becomes a consumer.

LLMs: When Machines Learned to Write

Large Language Models are the apex of contemporary neural engineering: billions of parameters, trillions of processed tokens, trained on practically all available public text — encyclopedias, books, scientific papers, forums, social media, code.

The result: text generation that, in many contexts, is indistinguishable from human writing. Not because the system understands, but because it can predict with frightening precision which token should come next.

LLMs aren’t “chatbots.” They’re linguistic infrastructure. Search, productivity suites, customer service, analysis, content production: the marginal cost of generating text collapses toward zero.

But what does it mean when cultural production can be automated? When the distinction between “written by a human” and “machine-generated” becomes irrelevant? In an attention economy where content is infinite, value shifts: toward curation, verification, context. Or it simply implodes.

The Invisible Cost of Neural Networks: Energy, Bias, Control

AI has an energy price. Training frontier models consumes vast electricity; inference multiplies that cost across billions of daily interactions. Neural networks are becoming an environmental problem on the scale of datacenters — because that’s what they are: datacenters disguised as “intelligence.”

But energy is the visible cost. The invisible cost is epistemic. LLMs learn from corpora reflecting social asymmetries: racism, sexism, classism — everything gets ground into embeddings. The machine doesn’t judge. It reproduces. And when output becomes part of decisions (hiring, loans, sentencing), bias becomes structural.

No technical fix exists for a political problem. Filtering and balancing are power choices. Who decides which voices matter? Which perspectives are “neutral”? Which are “bias”? The companies building these systems become de facto cultural arbiters — without elections, without public accountability.

Conclusion: The Broken Mirror

Neural networks haven’t revolutionized our lives because they’re intelligent. They’ve revolutionized them because they’re efficient. They industrialized tasks we believed exclusively human: recognizing faces, translating languages, writing texts, composing music.

But what have we learned in the process? That intelligence might not be the metaphysical specter we imagined. It’s large-scale pattern matching. Robust statistics. Sophisticated interpolation. And if this is true for machines, perhaps it’s true for us too.

The neural network is a mirror. It shows us the latent structure of our language, our prejudices, our obsessions. It forces us to confront questions we’d prefer to avoid. If a machine can imitate human creativity, what really was that creativity? If an algorithm can predict our choices, how free really were those choices?

The question isn’t whether machines will become intelligent. The question is: were we ever really intelligent?
