← Back to Part I
2

Chapter Two

Data Models and Query Languages

The most visible distinguishing factor between databases from a developer's point of view is the data model: how the data is organized and how you can query it.

Interactive: Data Model Explorer

Explore the three major data model paradigms. Click each model to see its structure, advantages, and best use cases.

📊

Relational Model

Data is organized into tables (relations) with rows and columns. Relationships are expressed through foreign keys.

Advantages

  • Schema enforcement ensures data consistency
  • Joins allow flexible data relationships
  • ACID transactions provide reliability
  • Mature ecosystem and tooling

Considerations

  • Schema migrations can be complex
  • Object-relational impedance mismatch
  • Complex joins can impact performance

Example Structure

CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  name VARCHAR(100),
  email VARCHAR(255) UNIQUE
);

CREATE TABLE posts (
  id SERIAL PRIMARY KEY,
  user_id INTEGER REFERENCES users(id),
  title VARCHAR(255),
  content TEXT
);

Best Use Case

Financial systems, ERPs, applications with complex relationships and strict consistency requirements

Relational vs Document Model

The Impedance Mismatch

Most application development is done in object-oriented programming languages, where data is represented as objects and references. The relational model, conversely, represents data as relations and foreign keys. This mismatch creates friction when translating between the two representations.

Application Layer

Objects with references, nested structures, and flexible schemas

Relational Database

Tables with rows, columns, foreign keys, and rigid schemas

Choosing Between Models

1

Data Relationships

For highly interconnected data with many-to-many relationships, relational or graph models shine. For mostly independent documents with few relationships, document databases simplify access patterns.

2

Schema Evolution

Document databases handle schema changes gracefully—different documents can have different fields. Relational databases require migrations, but provide stronger guarantees about data structure.

3

Access Patterns

If you typically need entire documents at once (e.g., a user profile with all their settings), document storage is efficient. If you need to query and aggregate across many records, relational databases excel.

Query Languages

Data is useless if you can't find, manipulate, and analyze it. Query languages are the interface between humans (or applications) and the stored data.

SQL vs NoSQL Query Comparison

Same query—finding all posts by a user named "Alice"—expressed in different languages.

SQL

Relational

SELECT p.title, p.content
FROM posts p
JOIN users u ON p.user_id = u.id
WHERE u.name = 'Alice';
NoSQL

Document (MongoDB)

db.users.findOne(
  { name: "Alice" }
).posts

Declarative vs Imperative Querying

Modern query languages (like SQL) are typically declarative: you specify what you want, not how to get it. This allows the database engine to optimize the execution plan.

Declarative (SQL)

"Give me all users who joined after January 2024" — The database decides how to retrieve this data efficiently.

Imperative (IMS, CODASYL)

"Follow pointer A to record X, then follow pointer B to record Y" — The programmer must know the data structure and traversal path.

Many-to-Many Relationships

Different data models handle relationships between entities differently. This is especially important for many-to-many relationships, which are common in real-world applications.

Relationship Models Compared

R

Relational: Join Table

Requires an intermediate table to link two entities:

CREATE TABLE person_organization (
  person_id INTEGER REFERENCES persons(id),
  org_id INTEGER REFERENCES organizations(id),
  role VARCHAR(50),
  PRIMARY KEY (person_id, org_id)
);
D

Document: Array of References

Store an array of references in one or both documents:

{
  "_id": "person123",
  "name": "Alice",
  "organizations": [
    { "id": "org1", "role": "Engineer" },
    { "id": "org2", "role": "Consultant" }
  ]
}

⚠️ Document databases have limitations here—querying "reverse" relationships efficiently can be challenging.

G

Graph: First-Class Edges

Relationships are first-class citizens with their own properties:

// Create relationship
CREATE (alice:Person {name: "Alice"})
CREATE (acme:Organization {name: "Acme Corp"})
CREATE (alice)-[:MEMBER_OF {role: "Engineer"}]->(acme)

// Query with relationship properties
MATCH (p:Person)-[r:MEMBER_OF]->(o:Organization)
WHERE p.name = "Alice"
RETURN p.name, o.name, r.role

✓ Graph models handle many-to-many relationships naturally.

Choosing Based on Relationship Complexity

Few Relationships

Document model works well when relationships are simple and one-directional

Moderate Complexity

Relational model with join tables handles moderate complexity reliably

Highly Connected

Graph model excels when traversing deep, interconnected relationships

Graph-Like Data Models

When your application's data naturally forms a graph—social networks, road networks, knowledge graphs, or recommendation engines—a graph database can dramatically simplify your code and improve performance.

When to Use Graph Models

👥

Social Networks

Friends, followers, connections—naturally graph-structured

🗺️

Route Planning

Roads, intersections, distances—perfect for graph algorithms

🔗

Knowledge Graphs

Concepts, entities, and their relationships—semantic networks

🎯

Recommendations

"People who bought X also bought Y"—graph-based collaborative filtering

Power of Graph Queries

Graph databases make complex multi-hop queries simple and efficient.

Query: "Friends of friends who live in San Francisco"

MATCH (me:Person {name: "Alice"})-[:FRIENDS_WITH]->(friend)-[:FRIENDS_WITH]->(fof:Person)
WHERE fof.city = "San Francisco"
RETURN DISTINCT fof.name

Query: "Shortest path between two people"

MATCH path = shortestPath(
  (alice:Person {name: "Alice"})-[*]-(bob:Person {name: "Bob"})
)
RETURN [n IN nodes(path) | n.name] AS connection_chain

Key insight: In a relational database, these queries require multiple self-joins whose depth must be known in advance. Graph databases handle arbitrary-depth traversals naturally.