TensorFlow in One Post

How Google's internal brain-training system became the tool that taught the world to build neural networks

In 2011, Google had a problem that no one outside Google had. They were training neural networks at a scale no research institution could match — on hundreds of millions of images, billions of words, years of user behavior. The models they built were powering Google Translate, speech recognition in Android, spam detection in Gmail.

They had a system called DistBelief to do it. DistBelief worked. But it was slow to modify, hard to run on new hardware, and deeply tied to Google’s internal infrastructure. If you wanted to train a new type of model, you were fighting the framework.

Jeff Dean, one of Google’s most celebrated engineers, led a team to build something better. The goal: a general-purpose system for building and training neural networks that was fast enough for Google’s scale, flexible enough for research, and designed from the start to run on the new hardware coming from GPU manufacturers.

They called it TensorFlow — named for the mathematical objects at its core, tensors, which flow through a computational graph.

On November 9, 2015, Google open-sourced it.

The research community, which had been using patchwork tools, adopted it immediately. Companies that wanted to do what Google was doing now had a path. Within a year, TensorFlow had more GitHub stars than any other machine learning project. Deep learning, which had been a specialized research discipline, started moving into production everywhere.


What TensorFlow Actually Is

At its core, TensorFlow is a numerical computation library built around one concept: the computational graph.

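A computational graph is a map of every mathematical operation in your model. Every addition, multiplication, and activation function becomes a node. The data flowing between nodes, the tensors, are the edges. By representing the entire computation as a graph, TensorFlow can optimize the computation before running it, compute gradients automatically for training, distribute the work across CPUs, GPUs, and multiple machines, and export a trained model to run outside the Python program that defined it.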

This is why Google built it. They needed those properties. But they also made it accessible enough that anyone could use it.
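
To make the graph idea concrete, here is a minimal sketch, assuming TensorFlow 2.x, in which tf.function traces an ordinary Python function into a graph that can then be inspected. The function name affine_relu and the tensor shapes are made up for illustration.

```python
import tensorflow as tf


@tf.function
def affine_relu(x, w, b):
    # Three operations: matrix multiply, add, ReLU
    return tf.nn.relu(tf.matmul(x, w) + b)


# Trace the function for specific input shapes to obtain its graph
concrete = affine_relu.get_concrete_function(
    tf.TensorSpec(shape=[None, 3], dtype=tf.float32),
    tf.TensorSpec(shape=[3, 2], dtype=tf.float32),
    tf.TensorSpec(shape=[2], dtype=tf.float32),
)

# Each node in the graph is an operation; tensors are the edges between them
op_types = [op.type for op in concrete.graph.get_operations()]
print(op_types)
```

The printed list should include the matrix multiply and ReLU nodes alongside placeholders for the three inputs. The exact set of operation names can differ between TensorFlow versions.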


The Keras Layer

For the first few years, writing TensorFlow code was verbose and required understanding the computational graph model explicitly. This was fine for researchers but a barrier for engineers.

Keras, a separate library created by François Chollet at Google and first released in 2015, was adopted into TensorFlow in 2017 as tf.keras, and TensorFlow 2.0 later made it the default high-level API. Keras speaks in terms of layers, models, and training loops. It hides the graph entirely for most use cases.

Today, almost everyone who uses TensorFlow uses the Keras API. tf.keras is where you start.


Let’s Build Something

A neural network that classifies handwritten digits — the canonical “hello world” of deep learning.

pip install tensorflow

import tensorflow as tf
from tensorflow import keras
import numpy as np

# Load the MNIST dataset — 60,000 training images of handwritten digits 0-9
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Normalize pixel values from [0, 255] to [0, 1]
x_train = x_train.astype("float32") / 255.0
x_test  = x_test.astype("float32")  / 255.0

# Build the model — a simple three-layer neural network
model = keras.Sequential([
    keras.Input(shape=(28, 28)),                   # each sample is a 28x28 grayscale image
    keras.layers.Flatten(),                        # 28x28 image -> 784 numbers
    keras.layers.Dense(128, activation="relu"),    # 128 neurons, ReLU activation
    keras.layers.Dropout(0.2),                     # randomly zero 20% to prevent overfitting
    keras.layers.Dense(10, activation="softmax"),  # 10 outputs, one per digit class
])

model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)            Output Shape          Param #
=================================================================
 flatten (Flatten)       (None, 784)           0
 dense (Dense)           (None, 128)           100,480
 dropout (Dropout)       (None, 128)           0
 dense_1 (Dense)         (None, 10)            1,290
=================================================================
Total params: 101,770

# Compile: choose optimizer, loss function, and what to measure
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Train — one line
history = model.fit(x_train, y_train, epochs=5, validation_split=0.1, batch_size=128)

Epoch 1/5 - loss: 0.2562, accuracy: 0.9253
Epoch 2/5 - loss: 0.1135, accuracy: 0.9660
...
Epoch 5/5 - loss: 0.0710, accuracy: 0.9784

# Evaluate on test data the model never saw during training
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(f"Test accuracy: {test_acc:.2%}")
# Test accuracy: 97.84%

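A neural network that recognizes handwritten digits with roughly 97.8% test accuracy, trained in five epochs, in about 20 lines of code. Exact numbers vary slightly from run to run because the weights start out random. That is what the Keras API over TensorFlow looks like.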


What Just Happened

Sequential — you are stacking layers in a line, each one feeding into the next. For more complex architectures (multiple inputs, branching paths, skip connections) you would use the Functional API or subclassing.
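
For comparison, here is a sketch of the same digit classifier written with the Functional API. The layer sizes are copied from the model above; the variable names (inputs, x, outputs, functional_model) are just illustrative.

```python
from tensorflow import keras

# Declare the input, then call each layer on the output of the previous one
inputs = keras.Input(shape=(28, 28))
x = keras.layers.Flatten()(inputs)
x = keras.layers.Dense(128, activation="relu")(x)
x = keras.layers.Dropout(0.2)(x)
outputs = keras.layers.Dense(10, activation="softmax")(x)

# A Model is defined by its inputs and outputs; the layers in between are inferred
functional_model = keras.Model(inputs=inputs, outputs=outputs)
print(functional_model.count_params())
```

Because every layer call is explicit, the same style extends to branches and skip connections: any intermediate tensor such as x can be fed into more than one layer.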

Flatten — converts the 28×28 pixel grid into a flat array of 784 numbers. The dense layers expect a 1D input.
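
What Flatten does can be reproduced with a plain NumPy reshape. This is a sketch of the idea rather than Keras's implementation, and the random batch below only stands in for real images.

```python
import numpy as np

# A fake batch of 2 grayscale "images", 28x28 each
batch = np.random.rand(2, 28, 28).astype("float32")

# Keep the batch dimension and merge the rest: (2, 28, 28) -> (2, 784)
flat = batch.reshape(batch.shape[0], -1)

print(flat.shape)  # (2, 784)
```

Rows are laid end to end, so position 28 in the flat array is the first pixel of the image's second row.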

Dense(128, activation="relu") is a fully connected layer with 128 neurons. ReLU (Rectified Linear Unit) sets negative values to zero. It is a simple function whose gradient does not shrink for positive inputs, which mitigates the “vanishing gradient” problem that made deep networks built on sigmoid and tanh activations hard to train.
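
As a rough sketch of what this layer computes, here is a forward pass in NumPy, with random weights standing in for trained ones. The helper name dense_relu is hypothetical. The parameter count matches the 100,480 reported by model.summary(): 784 × 128 weights plus 128 biases.

```python
import numpy as np

rng = np.random.default_rng(0)


def dense_relu(x, weights, bias):
    """Fully connected layer followed by ReLU: max(0, x @ W + b)."""
    return np.maximum(0.0, x @ weights + bias)


x = rng.normal(size=(1, 784))                      # one flattened image
weights = rng.normal(scale=0.05, size=(784, 128))  # one column per neuron
bias = np.zeros(128)                               # one bias per neuron

out = dense_relu(x, weights, bias)
n_params = weights.size + bias.size
print(out.shape, n_params)  # (1, 128) 100480
```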

Dropout(0.2) randomly zeros out 20% of the previous layer’s outputs on each training pass and scales the survivors up to compensate. It is switched off at inference time. This discourages the network from memorizing training data instead of learning the underlying pattern.
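
The mechanism can be sketched in a few lines of NumPy. This is the common "inverted dropout" formulation, where surviving values are scaled by 1 / (1 - rate) during training so the expected sum stays the same. The function below is an illustration, not Keras's own code.

```python
import numpy as np


def dropout(activations, rate, training, rng):
    """Zero a random fraction `rate` of values during training only."""
    if not training:
        return activations  # inference: pass everything through unchanged
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)


rng = np.random.default_rng(0)
acts = np.ones((1, 10000))

train_out = dropout(acts, rate=0.2, training=True, rng=rng)
infer_out = dropout(acts, rate=0.2, training=False, rng=rng)

print((train_out == 0).mean())  # close to 0.2
print(train_out.max())          # 1.25: survivors are scaled by 1 / 0.8
```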

Adam — an optimizer that adaptively adjusts learning rates for each weight. You almost always start with Adam.
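
To show what "adaptive" means here, the sketch below implements one textbook Adam update in NumPy. It follows the standard published formulation with commonly used default hyperparameters and is not Keras's exact implementation. The function name adam_step and the toy gradients are made up for illustration.

```python
import numpy as np


def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update for weights w at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad       # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad**2    # running mean of squared gradients
    m_hat = m / (1 - beta1**t)               # correct the bias toward zero
    v_hat = v / (1 - beta2**t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v


w = np.array([1.0, 1.0])
grad = np.array([100.0, 0.01])  # two weights with very different gradient scales

w_new, m, v = adam_step(w, grad, m=np.zeros(2), v=np.zeros(2), t=1)
step = w - w_new
print(step)
```

Both weights move by roughly the learning rate on this first step even though their gradients differ by a factor of 10,000. Plain gradient descent would move the first weight 10,000 times further than the second.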

sparse_categorical_crossentropy — the loss function for multi-class classification where labels are integers (0-9) rather than one-hot encoded vectors.
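
The difference between the sparse and one-hot versions is only in how the labels are stored. The NumPy sketch below, which uses three classes instead of ten to keep it short, computes both and shows they give the same loss. The function names mirror the Keras loss names but are plain illustrations.

```python
import numpy as np


def sparse_categorical_crossentropy(labels, probs):
    """Integer labels: pick the predicted probability of the true class."""
    picked = probs[np.arange(len(labels)), labels]
    return -np.log(picked)


def categorical_crossentropy(one_hot, probs):
    """One-hot labels: the same quantity, written as a sum over classes."""
    return -np.sum(one_hot * np.log(probs), axis=1)


probs = np.array([[0.7, 0.2, 0.1],   # softmax output for sample 0
                  [0.1, 0.1, 0.8]])  # softmax output for sample 1
labels = np.array([0, 2])            # integer labels
one_hot = np.eye(3)[labels]          # the same labels, one-hot encoded

sparse_loss = sparse_categorical_crossentropy(labels, probs)
dense_loss = categorical_crossentropy(one_hot, probs)
print(sparse_loss, dense_loss)
```

The loss for each sample is the negative log of the probability assigned to the correct class, so a confident correct prediction costs almost nothing and a confident wrong one costs a lot.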


Saving and Loading

# Save the full model
model.save("digit_classifier.keras")

# Load it back — weights, architecture, optimizer all preserved
loaded_model = keras.models.load_model("digit_classifier.keras")

# Run inference on a single image
image = x_test[0:1]  # shape (1, 28, 28)
predictions = loaded_model.predict(image)
predicted_digit = np.argmax(predictions[0])
print(f"Predicted: {predicted_digit}")
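
One way to convince yourself that nothing is lost in the round trip is to compare predictions before and after. The sketch below uses a tiny untrained model and a temporary directory so it runs quickly. The names tiny_model, restored, and tiny.keras are hypothetical.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras

# A deliberately small model: 4 inputs, 3 output classes
tiny_model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(3, activation="softmax"),
])
tiny_model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Save to a temporary location and load it back
path = os.path.join(tempfile.mkdtemp(), "tiny.keras")
tiny_model.save(path)
restored = keras.models.load_model(path)

# The restored model should produce the same predictions as the original
x = np.random.rand(2, 4).astype("float32")
before = tiny_model.predict(x, verbose=0)
after = restored.predict(x, verbose=0)
print(np.allclose(before, after))
```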

Why This Still Matters

PyTorch has overtaken TensorFlow in research adoption. Most academic papers that publish code now implement it in PyTorch. Many practitioners find PyTorch more intuitive for experimentation.

But TensorFlow remains widely deployed in production, particularly inside Google and in organizations that started with it years ago. TensorFlow Serving, TensorFlow Lite (for mobile devices), and TensorFlow.js (for the browser) are mature deployment targets, and their PyTorch counterparts are generally newer.

More importantly for 2026: TensorFlow is how millions of developers learned that neural networks were not black magic. The Keras API — .compile(), .fit(), .evaluate() — brought the same design philosophy as scikit-learn to deep learning. Pick your architecture, declare your loss and optimizer, call .fit(), measure accuracy.

The entire generation of engineers who moved from data science into AI infrastructure did it by running model.fit() for the first time and watching accuracy numbers climb. Most of them did it in TensorFlow.


The One Thing to Remember

TensorFlow took the internal infrastructure that powered Google at scale, simplified it through Keras, and gave every developer a path from raw data to a trained neural network in one afternoon.

Jeff Dean’s team built it to run Google’s AI. The world used it to learn AI. That is still its legacy.


Next in this series: Databricks. How the UC Berkeley researchers who built Apache Spark in a lab created the platform that many enterprise data teams run on today, and what the “lakehouse” architecture actually means for how companies use AI.