Files
grateful-journal/backend/SCHEMA.md
2026-03-09 10:54:07 +05:30

527 lines
13 KiB
Markdown

# Grateful Journal — MongoDB Schema Documentation
**Version:** 2.0 (Refactored)
**Last Updated:** 2026-03-05
---
## Overview
This document describes the refactored MongoDB schema for the Grateful Journal application. The schema has been redesigned to:
- Ensure one user per email (deduplicated)
- Use ObjectId references instead of strings
- Optimize queries for common operations (history pagination, calendar view)
- Prepare for client-side encryption
- Add proper indexes for performance
---
## Collections
### 1. `users` Collection
Stores user profile information. One document per unique email.
#### Schema
```javascript
{
_id: ObjectId,
email: string (unique),
displayName: string,
photoURL: string,
theme: "light" | "dark",
createdAt: Date,
updatedAt: Date
}
```
#### Field Descriptions
| Field | Type | Required | Notes |
| ------------- | -------- | -------- | ---------------------------------------- |
| `_id` | ObjectId | Yes | Unique primary key, auto-generated |
| `email` | String | Yes | User's email; unique constraint; indexed |
| `displayName` | String | Yes | User's display name (from Google Auth) |
| `photoURL` | String | No | User's profile photo URL |
| `theme` | String | Yes | Theme preference: "light" or "dark" |
| `createdAt` | Date | Yes | Account creation timestamp |
| `updatedAt` | Date | Yes | Last profile update timestamp |
#### Unique Constraints
- `email`: Unique index ensures one user per email address
#### Example Document
```json
{
"_id": ObjectId("507f1f77bcf86cd799439011"),
"email": "jeet.debnath2004@gmail.com",
"displayName": "Jeet Debnath",
"photoURL": "https://lh3.googleusercontent.com/a/ACg8...",
"theme": "light",
"createdAt": ISODate("2026-03-04T06:51:32.598Z"),
"updatedAt": ISODate("2026-03-05T10:30:00.000Z")
}
```
---
### 2. `entries` Collection
Stores journal entries for each user. Each entry has a logical journal date and optional encryption metadata.
#### Schema
```javascript
{
_id: ObjectId,
userId: ObjectId,
title: string,
content: string,
mood: "happy" | "sad" | "neutral" | "anxious" | "grateful" | null,
tags: string[],
isPublic: boolean,
entryDate: Date, // Logical journal date
createdAt: Date,
updatedAt: Date,
encryption: {
encrypted: boolean,
iv: string | null, // Base64-encoded initialization vector
algorithm: string | null // e.g., "AES-256-GCM"
}
}
```
#### Field Descriptions
| Field | Type | Required | Notes |
| ------------ | -------- | -------- | ----------------------------------------- |
| `_id` | ObjectId | Yes | Entry ID; auto-generated; indexed |
| `userId` | ObjectId | Yes | Reference to user.\_id; indexed; enforced |
| `title` | String | Yes | Entry title/headline |
| `content` | String | Yes | Entry body content |
| `mood` | String | No | Mood selector (null if not set) |
| `tags` | Array | Yes | Array of user-defined tags [] |
| `isPublic` | Bool | Yes | Public sharing flag (currently unused) |
| `entryDate` | Date | Yes | Logical journal date (start of day, UTC) |
| `createdAt` | Date | Yes | Database write timestamp |
| `updatedAt` | Date | Yes | Last modification timestamp |
| `encryption` | Object | Yes | Encryption metadata (nested) |
#### Encryption Metadata
```javascript
{
encrypted: boolean, // If true, content is encrypted
iv: string | null, // Base64 initialization vector
algorithm: string | null // Encryption algorithm name
}
```
**Notes:**
- `encrypted: false` by default (plain text storage)
- When setting `encrypted: true`, client provides `iv` and `algorithm`
- Server stores metadata but does NOT decrypt; decryption happens client-side
#### Example Document
```json
{
"_id": ObjectId("507f1f77bcf86cd799439012"),
"userId": ObjectId("507f1f77bcf86cd799439011"),
"title": "Today's Gratitude",
"content": "I'm grateful for my family, coffee, and a good day at work.",
"mood": "grateful",
"tags": ["family", "work", "coffee"],
"isPublic": false,
"entryDate": ISODate("2026-03-05T00:00:00.000Z"),
"createdAt": ISODate("2026-03-05T12:30:15.123Z"),
"updatedAt": ISODate("2026-03-05T12:30:15.123Z"),
"encryption": {
"encrypted": false,
"iv": null,
"algorithm": null
}
}
```
---
## Indexes
Indexes optimize query performance. All indexes are created by the `scripts/create_indexes.py` script.
### Users Indexes
```javascript
// Unique index on email (prevents duplicates)
db.users.createIndex({ email: 1 }, { unique: true });
// For sorting users by creation date
db.users.createIndex({ createdAt: -1 });
```
### Entries Indexes
```javascript
// Compound index for history pagination (most recent first)
db.entries.createIndex({ userId: 1, createdAt: -1 });
// Compound index for calendar queries by date
db.entries.createIndex({ userId: 1, entryDate: 1 });
// For tag-based searches (future feature)
db.entries.createIndex({ tags: 1 });
// For sorting by entry date
db.entries.createIndex({ entryDate: -1 });
```
### Index Rationale
- **`(userId, createdAt)`**: Supports retrieving a user's entries in reverse chronological order with pagination
- **`(userId, entryDate)`**: Supports calendar view queries (entries for a specific month/date)
- **`tags`**: Supports future tag filtering/search
- **`entryDate`**: Supports standalone date-range queries
---
## Query Patterns
### User Queries
#### Find or Create User (Upsert)
```python
db.users.update_one(
{ "email": email },
{
"$setOnInsert": {
"email": email,
"displayName": displayName,
"photoURL": photoURL,
"theme": "light",
"createdAt": datetime.utcnow()
},
"$set": {
"updatedAt": datetime.utcnow()
}
},
upsert=True
)
```
**Why:** Ensures exactly one user per email. Frontend calls this after any Firebase login.
#### Get User by Email
```python
user = db.users.find_one({ "email": email })
```
**Index Used:** Unique index on `email`
---
### Entry Queries
#### Create Entry
```python
db.entries.insert_one({
"userId": ObjectId(user_id),
"title": title,
"content": content,
"mood": mood,
"tags": tags,
"isPublic": False,
"entryDate": entry_date, # Start of day UTC
"createdAt": datetime.utcnow(),
"updatedAt": datetime.utcnow(),
"encryption": {
"encrypted": False,
"iv": None,
"algorithm": None
}
})
```
#### Get Entries for User (Paginated, Recent First)
```python
entries = db.entries.find(
{ "userId": ObjectId(user_id) }
).sort("createdAt", -1).skip(skip).limit(limit)
```
**Index Used:** `(userId, createdAt)`
**Use Case:** History page with pagination
#### Get Entries by Month (Calendar View)
```python
start_date = datetime(year, month, 1)
end_date = datetime(year, month + 1, 1)
entries = db.entries.find({
"userId": ObjectId(user_id),
"entryDate": {
"$gte": start_date,
"$lt": end_date
}
}).sort("entryDate", -1)
```
**Index Used:** `(userId, entryDate)`
**Use Case:** Calendar view showing entries for a specific month
#### Get Entry for Specific Date
```python
target_date = datetime(year, month, day)
next_date = target_date + timedelta(days=1)
entries = db.entries.find({
"userId": ObjectId(user_id),
"entryDate": {
"$gte": target_date,
"$lt": next_date
}
})
```
**Index Used:** `(userId, entryDate)`
**Use Case:** Daily view or fetching today's entry
#### Update Entry
```python
db.entries.update_one(
{ "_id": ObjectId(entry_id), "userId": ObjectId(user_id) },
{
"$set": {
"title": new_title,
"content": new_content,
"mood": new_mood,
"updatedAt": datetime.utcnow()
}
}
)
```
#### Delete Entry
```python
db.entries.delete_one({
"_id": ObjectId(entry_id),
"userId": ObjectId(user_id)
})
```
#### Delete All User Entries (on account deletion)
```python
db.entries.delete_many({ "userId": ObjectId(user_id) })
```
---
## Data Types & Conversions
### ObjectId
**MongoDB Storage:** `ObjectId`
**Python Type:** `bson.ObjectId`
**JSON Representation:** String (24-character hex)
**Conversion:**
```python
from bson import ObjectId
# String to ObjectId
oid = ObjectId(string_id)
# ObjectId to String (for JSON responses)
string_id = str(oid)
# Check if valid ObjectId string
try:
oid = ObjectId(potential_string)
except:
# Invalid ObjectId
pass
```
### Datetime
**MongoDB Storage:** ISODate (UTC)
**Python Type:** `datetime.datetime`
**JSON Representation:** ISO 8601 string
**Conversion:**
```python
from datetime import datetime
# Create UTC datetime
now = datetime.utcnow()
# ISO string to datetime
dt = datetime.fromisoformat(iso_string.replace("Z", "+00:00"))
# Datetime to ISO string
iso_string = dt.isoformat()
```
---
## Migration from Old Schema
### What Changed
| Aspect | Old Schema | New Schema |
| ------------ | ----------------------- | ------------------------------ |
| Users | Many per email possible | One per email (unique) |
| User \_id | ObjectId (correct) | ObjectId (unchanged) |
| Entry userId | String | ObjectId |
| Entry date | Only `createdAt` | `createdAt` + `entryDate` |
| Encryption | Not supported | Metadata in `encryption` field |
| Settings | Separate collection | Merged into `users.theme` |
| Indexes | None | Comprehensive indexes |
### Migration Steps
See [MIGRATION_GUIDE.md](./MIGRATION_GUIDE.md) for detailed instructions.
**Quick Summary:**
```bash
# 1. Backup database
mongodump --db grateful_journal --out ./backup
# 2. Run migration script
python backend/scripts/migrate_data.py
# 3. Create indexes
python backend/scripts/create_indexes.py
# 4. Verify data
python backend/scripts/verify_schema.py
```
---
## Security
### User Isolation
- All entry queries filter by `userId` to ensure users only access their own data
- Frontend enforces user_id matching via Firebase auth token
- Backend validates ObjectId conversions
### Encryption Ready
- `entries.encryption` metadata prepares schema for future client-side encryption
- Server stores encrypted content as-is without decryption
- Client responsible for IV, algorithm, and decryption keys
### Indexes & Performance
- Compound indexes prevent full collection scans
- Unique email index prevents user confusion
- Pagination support prevents memory overload
---
## Backup & Recovery
### Backup
```bash
# Full database
mongodump --db grateful_journal --out ./backup-$(date +%Y%m%d-%H%M%S)
# Specific collection
mongodump --db grateful_journal --collection entries --out ./backup-entries
```
### Restore
```bash
# Full database
mongorestore --db grateful_journal ./backup-2026-03-05-120000
# Specific collection
mongorestore --db grateful_journal ./backup-entries
```
---
## FAQ
### Q: Can I change the entryDate of an entry?
**A:** Yes. Send a PUT request with `entryDate` in the body. The entry will be re-indexed for calendar queries.
### Q: How do I encrypt entry content?
**A:**
1. Client encrypts content client-side using a key (not transmitted)
2. Client sends encrypted content + metadata (iv, algorithm)
3. Server stores content + encryption metadata as-is
4. On retrieval, client decrypts using stored IV and local key
### Q: What if I have duplicate users?
**A:** Run the migration script:
```bash
python backend/scripts/migrate_data.py
```
It detects duplicates, keeps the oldest, and consolidates entries.
### Q: Should I paginate entries?
**A:** Yes. Use `skip` and `limit` to prevent loading thousands of entries:
```
GET /api/entries/{user_id}?skip=0&limit=50
```
### Q: How do I query entries by date range?
**A:** Use the calendar endpoint or build a query:
```python
db.entries.find({
"userId": oid,
"entryDate": {
"$gte": start_date,
"$lt": end_date
}
})
```
---
## References
- [FastAPI Backend Routes](../routers/)
- [Pydantic Models](../models.py)
- [Migration Script](../scripts/migrate_data.py)
- [Index Creation Script](../scripts/create_indexes.py)
- [MongoDB Documentation](https://docs.mongodb.com/)
---
_For questions or issues, refer to the project README or open an issue on GitHub._