MongoDB, Redis, DynamoDB: CRUD, data modeling, indexing, caching patterns & when to use what
NoSQL (Not Only SQL) databases are designed for specific data models and have flexible schemas. They excel at scalability, performance, and handling unstructured or semi-structured data.
| Type | Description | Examples | Best For |
|---|---|---|---|
| Document | JSON/BSON documents with nested structures | MongoDB, CouchDB, Firestore | Content management, catalogs, user profiles |
| Key-Value | Simple key → value pairs, extremely fast | Redis, DynamoDB, etcd | Caching, sessions, config, rate limiting |
| Column-Family | Data stored in column families, wide rows | Cassandra, HBase, ScyllaDB | Time-series, IoT, analytics at scale |
| Graph | Nodes + edges with properties | Neo4j, Amazon Neptune | Social networks, recommendations, fraud detection |
| Search Engine | Full-text search with inverted indexes | Elasticsearch, Solr | Log analysis, search, autocomplete |
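Search-engine databases are built on inverted indexes, which map each term to the set of documents containing it. A minimal Python sketch of the idea (the sample corpus is made up; this is not how Elasticsearch stores data internally):

```python
from collections import defaultdict

# Hypothetical corpus: doc id -> text
docs = {
    1: "redis is an in-memory store",
    2: "mongodb stores json documents",
    3: "redis supports pub sub",
}

# Inverted index: term -> set of doc ids containing it
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def search(term):
    """Return sorted ids of documents containing the term."""
    return sorted(index.get(term, set()))

print(search("redis"))  # -> [1, 3]
```

Looking up a term is then a set lookup rather than a scan over every document, which is what makes full-text search and autocomplete fast.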
| Aspect | SQL (PostgreSQL, MySQL) | NoSQL (MongoDB, Redis) |
|---|---|---|
| Schema | Fixed schema, migrations needed | Flexible / schema-less |
| Query Language | SQL (standardized) | Database-specific APIs |
| Relationships | JOINs, normalized data | Embedded / denormalized data |
| Scaling | Vertical (scale up) | Horizontal (scale out / sharding) |
| Transactions | Strong ACID | Varies (eventual to strong consistency) |
| Data Model | Tables with rows & columns | Documents, key-value, columns, graphs |
| Best For | Complex queries, relationships, reporting | High scale, flexible data, specific access patterns |
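To make the scaling row concrete: horizontal scaling (sharding) routes each key to a node by hashing it. A toy Python router sketch (node names are hypothetical; real systems such as MongoDB sharding or DynamoDB partitioning layer consistent hashing, replication, and rebalancing on top of this idea):

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical shard names

def shard_for(key: str) -> str:
    """Deterministically map a key to one shard by hashing it."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

# The same key always routes to the same shard,
# so reads and writes for a key hit one node.
assert shard_for("user:42") == shard_for("user:42")
```

Adding capacity means adding nodes and redistributing keys, rather than buying a bigger machine.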
MongoDB stores data as BSON documents (binary-encoded JSON) in collections (≈ tables).
| SQL | MongoDB |
|---|---|
| Database | Database |
| Table | Collection |
| Row | Document |
| Column | Field |
| Primary Key | _id (auto-generated ObjectId) |
| JOIN | $lookup (aggregation) or embed |
| Index | Index |
// Connect via mongosh
mongosh "mongodb://localhost:27017/mydb"
// Basic shell commands
show dbs // list databases
use mydb // switch database
show collections // list collections
db.users.stats() // collection statistics
db.dropDatabase() // drop current database

// ── INSERT ──
db.users.insertOne({
  name: "Alice",
  email: "alice@example.com",
  age: 30,
  tags: ["admin", "developer"],
  address: { city: "NYC", zip: "10001" }
});
db.users.insertMany([
  { name: "Bob", age: 25 },
  { name: "Charlie", age: 35 }
]);
// ── READ ──
db.users.find({}); // all documents
db.users.findOne({ email: "alice@example.com" });
db.users.find({ age: { $gte: 25 } }); // age >= 25
db.users.find({}, { name: 1, email: 1 }); // projection (only name & email)
db.users.find({}).sort({ age: -1 }).limit(10);
// ── UPDATE ──
db.users.updateOne(
  { email: "alice@example.com" },
  { $set: { age: 31, role: "admin" } }
);
db.users.updateMany(
  { age: { $lt: 18 } },
  { $set: { status: "minor" } }
);
// Upsert (insert if not found)
db.users.updateOne(
  { email: "new@user.com" },
  { $set: { name: "New User" } },
  { upsert: true }
);
// ── DELETE ──
db.users.deleteOne({ email: "bob@example.com" });
db.users.deleteMany({ age: { $lt: 18 } });
db.users.drop(); // delete entire collection

| Operator | Description | Example |
|---|---|---|
| $eq | Equal | { age: { $eq: 25 } } |
| $ne | Not equal | { status: { $ne: "inactive" } } |
| $gt / $gte | Greater than / >= | { age: { $gte: 18 } } |
| $lt / $lte | Less than / <= | { price: { $lt: 100 } } |
| $in | In array | { role: { $in: ["admin", "mod"] } } |
| $nin | Not in array | { role: { $nin: ["banned"] } } |
// AND (implicit)
db.users.find({ age: { $gte: 18 }, role: "admin" });
// OR
db.users.find({ $or: [{ age: { $lt: 18 } }, { role: "minor" }] });
// NOT
db.users.find({ age: { $not: { $lt: 18 } } });
// Array queries
db.users.find({ tags: "admin" }); // has "admin" in array
db.users.find({ tags: { $all: ["admin", "dev"] } }); // has both
db.users.find({ tags: { $size: 3 } }); // array length = 3
db.users.find({ "tags.0": "admin" }); // first element
// Nested document query
db.users.find({ "address.city": "NYC" });
// Exists
db.users.find({ phone: { $exists: true } });
// Regex
db.users.find({ name: { $regex: /^ali/i } });

The aggregation pipeline processes documents through a sequence of stages, similar to Unix pipes.
db.orders.aggregate([
  // Stage 1: Filter
  { $match: { status: "completed" } },
  // Stage 2: Group by user and calculate stats
  { $group: {
      _id: "$user_id",
      totalSpent: { $sum: "$total" },
      orderCount: { $sum: 1 },
      avgOrder: { $avg: "$total" }
  }},
  // Stage 3: Sort by total spent descending
  { $sort: { totalSpent: -1 } },
  // Stage 4: Limit to top 10
  { $limit: 10 },
  // Stage 5: Join with users collection
  { $lookup: {
      from: "users",
      localField: "_id",
      foreignField: "_id",
      as: "user"
  }},
  // Stage 6: Reshape output
  { $project: {
      userName: { $arrayElemAt: ["$user.name", 0] },
      totalSpent: 1,
      orderCount: 1,
      avgOrder: { $round: ["$avgOrder", 2] }
  }}
]);

| Stage | Purpose |
|---|---|
| $match | Filter documents (like WHERE) |
| $group | Group by field, apply accumulators ($sum, $avg, $min, $max, $push) |
| $sort | Sort documents |
| $project | Reshape: include/exclude/compute fields |
| $lookup | Left outer join with another collection |
| $unwind | Deconstruct array field → one doc per element |
| $limit / $skip | Pagination |
| $addFields | Add computed fields |
| $facet | Run multiple pipelines in parallel on the same input |
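To build intuition for the core stages, here is a pure-Python analogue of $match → $group → $sort over plain dicts (the sample orders are made up, and MongoDB does not execute pipelines this way):

```python
from collections import defaultdict

orders = [  # hypothetical sample data
    {"user_id": "u1", "status": "completed", "total": 50.0},
    {"user_id": "u1", "status": "completed", "total": 30.0},
    {"user_id": "u2", "status": "completed", "total": 99.0},
    {"user_id": "u2", "status": "pending", "total": 10.0},
]

# $match: { status: "completed" }
matched = [o for o in orders if o["status"] == "completed"]

# $group: _id "$user_id" with $sum accumulators
groups = defaultdict(lambda: {"totalSpent": 0.0, "orderCount": 0})
for o in matched:
    g = groups[o["user_id"]]
    g["totalSpent"] += o["total"]
    g["orderCount"] += 1

# $sort: { totalSpent: -1 }
result = sorted(groups.items(), key=lambda kv: kv[1]["totalSpent"], reverse=True)
print(result[0])  # -> ('u2', {'totalSpent': 99.0, 'orderCount': 1})
```

Each stage consumes the previous stage's output, which is why stage order matters (e.g. $match early keeps later stages cheap).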
// Single field index
db.users.createIndex({ email: 1 }); // ascending
db.users.createIndex({ email: 1 }, { unique: true });
// Compound index
db.orders.createIndex({ user_id: 1, created_at: -1 });
// Text index (full-text search)
db.articles.createIndex({ title: "text", body: "text" });
db.articles.find({ $text: { $search: "mongodb nosql" } });
// TTL index (auto-delete after expiry)
db.sessions.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 });
// Partial index
db.users.createIndex(
  { email: 1 },
  { partialFilterExpression: { status: "active" } }
);
// View indexes
db.users.getIndexes();
// Explain query (check if index is used)
db.users.find({ email: "a@b.com" }).explain("executionStats");

Data modeling: embed related data in a single document, or reference across collections and join with $lookup.

// ── Embedded pattern (blog post with comments) ──
{
  _id: ObjectId("..."),
  title: "MongoDB Guide",
  author: "Alice",
  comments: [
    { user: "Bob", text: "Great post!", date: ISODate("2024-01-15") },
    { user: "Charlie", text: "Very helpful", date: ISODate("2024-01-16") }
  ]
}
// ── Referenced pattern (users and orders) ──
// users collection
{ _id: ObjectId("u1"), name: "Alice", email: "alice@ex.com" }
// orders collection
{ _id: ObjectId("o1"), user_id: ObjectId("u1"), total: 99.99 }

| Pattern | Description | Use Case |
|---|---|---|
| Bucket | Group related data into buckets (e.g., by time) | IoT data, time-series |
| Computed | Pre-compute aggregations on write | Dashboards, leaderboards |
| Subset | Store frequent fields in main doc, rest in separate | Product listings (summary + details) |
| Polymorphic | Different doc structures in same collection | Mixed content types |
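As an illustration of the Bucket pattern, the sketch below groups raw sensor readings into one bucket document per (sensor, hour); the field names are illustrative, not a fixed schema:

```python
from collections import defaultdict

# Hypothetical raw readings: (sensor_id, minute, temperature)
readings = [
    ("s1", 0, 20.1), ("s1", 5, 20.4), ("s1", 61, 21.0), ("s2", 2, 18.9),
]

# One bucket per (sensor, hour) instead of one document per reading
buckets = defaultdict(list)
for sensor_id, minute, temp in readings:
    buckets[(sensor_id, minute // 60)].append({"minute": minute, "temp": temp})

docs = [
    {"sensor_id": s, "hour": h, "count": len(v), "measurements": v}
    for (s, h), v in sorted(buckets.items())
]
print(len(docs))  # 4 readings collapse into 3 bucket documents
```

Fewer, larger documents mean fewer index entries and cheaper range reads for time-series workloads.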
Redis is an in-memory key-value data store used for caching, sessions, real-time analytics, queues, pub/sub, and more. It is extremely fast, handling 100k+ ops/sec on a single node.
# Connect
redis-cli -h localhost -p 6379
# Basic string operations
SET user:1:name "Alice"
GET user:1:name # → "Alice"
SET session:abc123 "data" EX 3600 # expires in 1 hour
TTL session:abc123 # seconds remaining
DEL user:1:name # delete key
EXISTS user:1:name # 1 if exists, 0 if not
# Atomic increment (perfect for counters)
INCR page:views # increment by 1
INCRBY page:views 10 # increment by 10
# Multiple operations
MSET k1 "v1" k2 "v2" k3 "v3"
MGET k1 k2 k3
# Key patterns
KEYS user:* # find matching keys (DON'T use in production!)
SCAN 0 MATCH user:* COUNT 100 # safe cursor-based iteration

# ── HASH (like a mini document) ──
HSET user:1 name "Alice" age 30 email "alice@ex.com"
HGET user:1 name # → "Alice"
HGETALL user:1 # → all fields & values
HINCRBY user:1 age 1 # → 31
# ── LIST (ordered, duplicates allowed) ──
LPUSH queue:tasks "task1" "task2" # push left
RPUSH queue:tasks "task3" # push right
LPOP queue:tasks # pop from left
RPOP queue:tasks # pop from right
LRANGE queue:tasks 0 -1 # all elements
LLEN queue:tasks # length
# ── SET (unique, unordered) ──
SADD tags:article:1 "redis" "nosql" "database"
SMEMBERS tags:article:1 # all members
SISMEMBER tags:article:1 "redis" # 1 if member, 0 if not
SINTER tags:article:1 tags:article:2 # intersection
# ── SORTED SET (unique, scored & ordered) ──
ZADD leaderboard 100 "Alice" 85 "Bob" 92 "Charlie"
ZRANGE leaderboard 0 -1 WITHSCORES # ascending
ZREVRANGE leaderboard 0 2 # top 3 descending
ZRANK leaderboard "Alice" # ascending rank (0-based); use ZREVRANK for leaderboard position
ZINCRBY leaderboard 5 "Alice" # increment score

| Type | Use Case |
|---|---|
| String | Caching, counters, session tokens |
| Hash | User profiles, object storage |
| List | Message queues, activity feeds, recent items |
| Set | Tags, unique visitors, social connections |
| Sorted Set | Leaderboards, priority queues, rate limiters |
| Stream | Event sourcing, log processing |
# Pseudocode: check cache first, then DB
# (assumes a connected `redis` client and a `db` handle)
import json

def get_user(user_id):
    # 1. Check cache
    cached = redis.get(f"user:{user_id}")
    if cached:
        return json.loads(cached)
    # 2. Cache miss: fetch from DB
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)
    # 3. Store in cache with TTL
    redis.setex(f"user:{user_id}", 3600, json.dumps(user))
    return user

# Publisher
PUBLISH notifications "New order received!"
# Subscriber (in another terminal)
SUBSCRIBE notifications
# Blocks and receives messages in real-time

# Using a sorted set as a sliding-window rate limiter
# (assumes `import time` and a connected `redis` client)
def is_rate_limited(user_id, limit=100, window=60):
    key = f"ratelimit:{user_id}"
    now = time.time()
    pipe = redis.pipeline()
    pipe.zremrangebyscore(key, 0, now - window)  # remove entries older than the window
    pipe.zadd(key, {str(now): now})              # add current request
    pipe.zcard(key)                              # count requests in the window
    pipe.expire(key, window)                     # set TTL so idle keys expire
    _, _, count, _ = pipe.execute()
    return count > limit

| Option | How | Trade-off |
|---|---|---|
| RDB | Point-in-time snapshots at intervals | Fast recovery, may lose recent data |
| AOF | Append every write operation to log | More durable, larger file, slower recovery |
| RDB + AOF | Both combined | Best durability, recommended for production |
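In redis.conf, these options look roughly like this (a sketch; verify directives and defaults against your Redis version):

```
# redis.conf sketch (verify against your Redis version)
save 900 1            # RDB: snapshot if >= 1 key changed in 900 s
save 300 10           # ...or >= 10 keys changed in 300 s
appendonly yes        # enable AOF
appendfsync everysec  # fsync the AOF roughly once per second
```

appendfsync everysec is the usual middle ground: at most about one second of writes is at risk on a crash.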
DynamoDB is AWS's fully managed NoSQL key-value and document database, designed for single-digit-millisecond latency at any scale.
| Concept | Description |
|---|---|
| Table | Collection of items (similar to a SQL table) |
| Item | A single record (like a row) |
| Partition Key (PK) | Primary key; determines which partition stores the item |
| Sort Key (SK) | Optional; allows multiple items per partition key (range queries) |
| GSI | Global Secondary Index: different PK/SK for alternative access patterns |
| LSI | Local Secondary Index: same PK, different SK |
# DynamoDB is access-pattern-driven!
# Design your table around HOW you'll query it
# Example: E-commerce orders
# PK: USER#alice SK: ORDER#2024-01-15#001
# PK: USER#alice SK: PROFILE
# PK: USER#alice SK: ADDRESS#home
# Single-table design: multiple entity types in one table
# Enables fetching related data in a single query
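The effect of that key schema can be simulated in plain Python: a query is a partition-key equality plus an optional begins_with on the sort key (the items below are made up; real access goes through boto3 / the DynamoDB API):

```python
# Hypothetical single-table items
items = [
    {"PK": "USER#alice", "SK": "PROFILE"},
    {"PK": "USER#alice", "SK": "ORDER#2024-01-15#001"},
    {"PK": "USER#alice", "SK": "ORDER#2024-02-01#002"},
    {"PK": "USER#bob", "SK": "ORDER#2024-01-20#001"},
]

def query(pk, sk_prefix=""):
    """PK equality + begins_with(SK, prefix), like a DynamoDB Query."""
    return [i for i in items if i["PK"] == pk and i["SK"].startswith(sk_prefix)]

print(len(query("USER#alice", "ORDER#")))  # -> 2 (profile and bob's order excluded)
```

Because sort keys are ordered strings, date-prefixed SKs like ORDER#2024-01-15#001 also give you chronological range queries for free.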
# Query: Get all orders for a user
table.query(
    KeyConditionExpression="PK = :pk AND begins_with(SK, :prefix)",
    ExpressionAttributeValues={
        ":pk": "USER#alice",
        ":prefix": "ORDER#"
    }
)

| Requirement | Best Choice | Why |
|---|---|---|
| Complex queries, relationships | PostgreSQL | Rich SQL, JOINs, ACID |
| Flexible documents, rapid iteration | MongoDB | Schema-less, easy to start |
| Caching, sessions, real-time | Redis | In-memory speed, rich data types |
| Massive scale, simple key access | DynamoDB | Managed, auto-scaling, predictable |
| Time-series, high write throughput | Cassandra | Distributed writes, linear scaling |
| Full-text search, analytics | Elasticsearch | Inverted indexes, aggregation |
| Social graphs, relationships | Neo4j | Native graph traversal |