Files
grateful-journal/docs/SCHEMA.md

13 KiB

Grateful Journal — MongoDB Schema Documentation

Version: 2.0 (Refactored)
Last Updated: 2026-03-05


Overview

This document describes the refactored MongoDB schema for the Grateful Journal application. The schema has been redesigned to:

  • Ensure one user per email (deduplicated)
  • Use ObjectId references instead of strings
  • Optimize queries for common operations (history pagination, calendar view)
  • Prepare for client-side encryption
  • Add proper indexes for performance

Collections

1. users Collection

Stores user profile information. One document per unique email.

Schema

{
  _id: ObjectId,
  email: string (unique),
  displayName: string,
  photoURL: string,
  theme: "light" | "dark",
  createdAt: Date,
  updatedAt: Date
}

Field Descriptions

Field Type Required Notes
_id ObjectId Yes Unique primary key, auto-generated
email String Yes User's email; unique constraint; indexed
displayName String Yes User's display name (from Google Auth)
photoURL String No User's profile photo URL
theme String Yes Theme preference: "light" or "dark"
createdAt Date Yes Account creation timestamp
updatedAt Date Yes Last profile update timestamp

Unique Constraints

  • email: Unique index ensures one user per email address

Example Document

{
  "_id": ObjectId("507f1f77bcf86cd799439011"),
  "email": "jeet.debnath2004@gmail.com",
  "displayName": "Jeet Debnath",
  "photoURL": "https://lh3.googleusercontent.com/a/ACg8...",
  "theme": "light",
  "createdAt": ISODate("2026-03-04T06:51:32.598Z"),
  "updatedAt": ISODate("2026-03-05T10:30:00.000Z")
}

2. entries Collection

Stores journal entries for each user. Each entry has a logical journal date and optional encryption metadata.

Schema

{
  _id: ObjectId,
  userId: ObjectId,
  title: string,
  content: string,
  mood: "happy" | "sad" | "neutral" | "anxious" | "grateful" | null,
  tags: string[],
  isPublic: boolean,

  entryDate: Date,        // Logical journal date
  createdAt: Date,
  updatedAt: Date,

  encryption: {
    encrypted: boolean,
    iv: string | null,      // Base64-encoded initialization vector
    algorithm: string | null // e.g., "AES-256-GCM"
  }
}

Field Descriptions

Field Type Required Notes
_id ObjectId Yes Entry ID; auto-generated; indexed
userId ObjectId Yes Reference to user._id; indexed; enforced
title String Yes Entry title/headline
content String Yes Entry body content
mood String No Mood selector (null if not set)
tags Array Yes Array of user-defined tags []
isPublic Bool Yes Public sharing flag (currently unused)
entryDate Date Yes Logical journal date (start of day, UTC)
createdAt Date Yes Database write timestamp
updatedAt Date Yes Last modification timestamp
encryption Object Yes Encryption metadata (nested)

Encryption Metadata

{
  encrypted: boolean,       // If true, content is encrypted
  iv: string | null,        // Base64 initialization vector
  algorithm: string | null  // Encryption algorithm name
}

Notes:

  • encrypted: false by default (plain text storage)
  • When setting encrypted: true, client provides iv and algorithm
  • Server stores metadata but does NOT decrypt; decryption happens client-side

Example Document

{
  "_id": ObjectId("507f1f77bcf86cd799439012"),
  "userId": ObjectId("507f1f77bcf86cd799439011"),
  "title": "Today's Gratitude",
  "content": "I'm grateful for my family, coffee, and a good day at work.",
  "mood": "grateful",
  "tags": ["family", "work", "coffee"],
  "isPublic": false,
  "entryDate": ISODate("2026-03-05T00:00:00.000Z"),
  "createdAt": ISODate("2026-03-05T12:30:15.123Z"),
  "updatedAt": ISODate("2026-03-05T12:30:15.123Z"),
  "encryption": {
    "encrypted": false,
    "iv": null,
    "algorithm": null
  }
}

Indexes

Indexes optimize query performance. All indexes are created by the scripts/create_indexes.py script.

Users Indexes

// Unique index on email (prevents duplicates)
db.users.createIndex({ email: 1 }, { unique: true });

// For sorting users by creation date
db.users.createIndex({ createdAt: -1 });

Entries Indexes

// Compound index for history pagination (most recent first)
db.entries.createIndex({ userId: 1, createdAt: -1 });

// Compound index for calendar queries by date
db.entries.createIndex({ userId: 1, entryDate: 1 });

// For tag-based searches (future feature)
db.entries.createIndex({ tags: 1 });

// For sorting by entry date
db.entries.createIndex({ entryDate: -1 });

Index Rationale

  • (userId, createdAt): Supports retrieving a user's entries in reverse chronological order with pagination
  • (userId, entryDate): Supports calendar view queries (entries for a specific month/date)
  • tags: Supports future tag filtering/search
  • entryDate: Supports standalone date-range queries

Query Patterns

User Queries

Find or Create User (Upsert)

db.users.update_one(
    { "email": email },
    {
        "$setOnInsert": {
            "email": email,
            "displayName": displayName,
            "photoURL": photoURL,
            "theme": "light",
            "createdAt": datetime.utcnow()
        },
        "$set": {
            "updatedAt": datetime.utcnow()
        }
    },
    upsert=True
)

Why: Ensures exactly one user per email. Frontend calls this after any Firebase login.

Get User by Email

user = db.users.find_one({ "email": email })

Index Used: Unique index on email


Entry Queries

Create Entry

db.entries.insert_one({
    "userId": ObjectId(user_id),
    "title": title,
    "content": content,
    "mood": mood,
    "tags": tags,
    "isPublic": False,
    "entryDate": entry_date,  # Start of day UTC
    "createdAt": datetime.utcnow(),
    "updatedAt": datetime.utcnow(),
    "encryption": {
        "encrypted": False,
        "iv": None,
        "algorithm": None
    }
})

Get Entries for User (Paginated, Recent First)

entries = db.entries.find(
    { "userId": ObjectId(user_id) }
).sort("createdAt", -1).skip(skip).limit(limit)

Index Used: (userId, createdAt)
Use Case: History page with pagination

Get Entries by Month (Calendar View)

start_date = datetime(year, month, 1)
end_date = datetime(year, month + 1, 1)

entries = db.entries.find({
    "userId": ObjectId(user_id),
    "entryDate": {
        "$gte": start_date,
        "$lt": end_date
    }
}).sort("entryDate", -1)

Index Used: (userId, entryDate)
Use Case: Calendar view showing entries for a specific month

Get Entry for Specific Date

target_date = datetime(year, month, day)
next_date = target_date + timedelta(days=1)

entries = db.entries.find({
    "userId": ObjectId(user_id),
    "entryDate": {
        "$gte": target_date,
        "$lt": next_date
    }
})

Index Used: (userId, entryDate)
Use Case: Daily view or fetching today's entry

Update Entry

db.entries.update_one(
    { "_id": ObjectId(entry_id), "userId": ObjectId(user_id) },
    {
        "$set": {
            "title": new_title,
            "content": new_content,
            "mood": new_mood,
            "updatedAt": datetime.utcnow()
        }
    }
)

Delete Entry

db.entries.delete_one({
    "_id": ObjectId(entry_id),
    "userId": ObjectId(user_id)
})

Delete All User Entries (on account deletion)

db.entries.delete_many({ "userId": ObjectId(user_id) })

Data Types & Conversions

ObjectId

MongoDB Storage: ObjectId
Python Type: bson.ObjectId
JSON Representation: String (24-character hex)

Conversion:

from bson import ObjectId

# String to ObjectId
oid = ObjectId(string_id)

# ObjectId to String (for JSON responses)
string_id = str(oid)

# Check if valid ObjectId string
try:
    oid = ObjectId(potential_string)
except:
    # Invalid ObjectId
    pass

Datetime

MongoDB Storage: ISODate (UTC)
Python Type: datetime.datetime
JSON Representation: ISO 8601 string

Conversion:

from datetime import datetime

# Create UTC datetime
now = datetime.utcnow()

# ISO string to datetime
dt = datetime.fromisoformat(iso_string.replace("Z", "+00:00"))

# Datetime to ISO string
iso_string = dt.isoformat()

Migration from Old Schema

What Changed

Aspect Old Schema New Schema
Users Many per email possible One per email (unique)
User _id ObjectId (correct) ObjectId (unchanged)
Entry userId String ObjectId
Entry date Only createdAt createdAt + entryDate
Encryption Not supported Metadata in encryption field
Settings Separate collection Merged into users.theme
Indexes None Comprehensive indexes

Migration Steps

See MIGRATION_GUIDE.md for detailed instructions.

Quick Summary:

# 1. Backup database
mongodump --db grateful_journal --out ./backup

# 2. Run migration script
python backend/scripts/migrate_data.py

# 3. Create indexes
python backend/scripts/create_indexes.py

# 4. Verify data
python backend/scripts/verify_schema.py

Security

User Isolation

  • All entry queries filter by userId to ensure users only access their own data
  • Frontend enforces user_id matching via Firebase auth token
  • Backend validates ObjectId conversions

Encryption Ready

  • entries.encryption metadata prepares schema for future client-side encryption
  • Server stores encrypted content as-is without decryption
  • Client responsible for IV, algorithm, and decryption keys

Indexes & Performance

  • Compound indexes prevent full collection scans
  • Unique email index prevents user confusion
  • Pagination support prevents memory overload

Backup & Recovery

Backup

# Full database
mongodump --db grateful_journal --out ./backup-$(date +%Y%m%d-%H%M%S)

# Specific collection
mongodump --db grateful_journal --collection entries --out ./backup-entries

Restore

# Full database
mongorestore --db grateful_journal ./backup-2026-03-05-120000

# Specific collection
mongorestore --db grateful_journal ./backup-entries

FAQ

Q: Can I change the entryDate of an entry?

A: Yes. Send a PUT request with entryDate in the body. The entry will be re-indexed for calendar queries.

Q: How do I encrypt entry content?

A:

  1. Client encrypts content client-side using a key (not transmitted)
  2. Client sends encrypted content + metadata (iv, algorithm)
  3. Server stores content + encryption metadata as-is
  4. On retrieval, client decrypts using stored IV and local key

Q: What if I have duplicate users?

A: Run the migration script:

python backend/scripts/migrate_data.py

It detects duplicates, keeps the oldest, and consolidates entries.

Q: Should I paginate entries?

A: Yes. Use skip and limit to prevent loading thousands of entries:

GET /api/entries/{user_id}?skip=0&limit=50

Q: How do I query entries by date range?

A: Use the calendar endpoint or build a query:

db.entries.find({
    "userId": oid,
    "entryDate": {
        "$gte": start_date,
        "$lt": end_date
    }
})

References


For questions or issues, refer to the project README or open an issue on GitHub.