Building a Social Media Platform with PostgreSQL: A Final Project
Welcome to our final project in the PostgreSQL learning series!
We'll build a simple social media platform similar to Reddit or X, where users can create posts, like or downvote posts, and comment on them.
We'll apply all the concepts we've learned, including compound keys, foreign keys, and various SQL operations. This will hopefully help you see some of the database techniques applied practically and give you a better mental model.
1. Designing the Database Schema
The first step in creating our social media platform is designing the database schema. We must think carefully about what information we want to store and how different pieces of data relate.
Let's start by identifying the leading entities in our system:
- Users
- Posts
- Likes (and Dislikes)
- Comments
Now, let's create tables for each of these entities. Take your time to read through each table to try and understand a guess why we made some of the decisions:
-- Users table CREATE TABLE Users ( UserID SERIAL PRIMARY KEY, Username VARCHAR(50) UNIQUE NOT NULL, Email VARCHAR(100) UNIQUE NOT NULL, CreatedAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP ); -- Posts table CREATE TABLE Posts ( PostID SERIAL PRIMARY KEY, UserID INT NOT NULL, Content VARCHAR(280) NOT NULL, CreatedAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP, FOREIGN KEY (UserID) REFERENCES Users(UserID) ); -- Likes table (using a compound key) CREATE TABLE Likes ( UserID INT, PostID INT, IsLike BOOLEAN NOT NULL, -- TRUE for like, FALSE for downvote CreatedAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY (UserID, PostID), FOREIGN KEY (UserID) REFERENCES Users(UserID), FOREIGN KEY (PostID) REFERENCES Posts(PostID) ); -- Comments table CREATE TABLE Comments ( CommentID SERIAL PRIMARY KEY, PostID INT NOT NULL, UserID INT NOT NULL, Content VARCHAR(280) NOT NULL, CreatedAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP, FOREIGN KEY (PostID) REFERENCES Posts(PostID), FOREIGN KEY (UserID) REFERENCES Users(UserID) );
Let's break down each table and explain the design choices:
Users Table
UserID
is the primary key. We useSERIAL
to automatically generate unique IDs.Username
andEmail
are set asUNIQUE
to ensure no two users have the same username or email.CreatedAt
usesDEFAULT CURRENT_TIMESTAMP
to automatically record when the user was created.
Posts Table
PostID
is the primary key, again usingSERIAL
.UserID
is a foreign key referencing the Users table. This creates a relationship between posts and users.Content
is limited to 280 characters, similar to Twitter's limit.
Likes Table
- This table uses a compound key (
UserID
andPostID
) as its primary key. This ensures a user can only like or downvote a post once. IsLike
is a boolean field.TRUE
represents a like,FALSE
represents a downvote.- We use foreign keys to link to both Users and Posts tables.
Comments Table
CommentID
is the primary key.- We use foreign keys to link each comment to both a post (
PostID
) and a user (UserID
).
2. Indexing for Performance
Now that we have our basic schema let's think about performance. As our social media platform grows, certain queries might become slow. Indexing is a technique we can use to speed up these queries.
-- Index on Posts(UserID) for faster user-specific post queries CREATE INDEX idx_posts_userid ON Posts(UserID); -- Index on Comments(PostID) for faster comment retrieval CREATE INDEX idx_comments_postid ON Comments(PostID); -- Index on Likes(PostID) for faster like counting CREATE INDEX idx_likes_postid ON Likes(PostID); -- Index on Users(Username) for faster username lookups CREATE INDEX idx_users_username ON Users(Username); -- Index on Posts(CreatedAt) for faster date-based queries CREATE INDEX idx_posts_createdat ON Posts(CreatedAt);
Let's explain each index:
idx_posts_userid
: This will speed up queries that look for posts by a specific user.idx_comments_postid
: This will make finding all comments for a specific post faster.idx_likes_postid
: This will help us quickly count likes for a post.idx_users_username
: This will speed up searches for users by username.idx_posts_createdat
: This will help with queries that filter or sort posts by date.
Remember, while indexes speed up reads, they can slow down writes (inserts, updates, deletes). It's a trade-off, so we're indexing columns that we expect to frequently use in WHERE clauses or JOIN conditions.
3. Populating the Database
Now that we have our schema, let's add some sample data. This will help us test our queries later.
-- Insert Users INSERT INTO Users (Username, Email) VALUES ('john_doe', 'john@example.com'), ('jane_smith', 'jane@example.com'), ('bob_johnson', 'bob@example.com'); -- Insert Posts INSERT INTO Posts (UserID, Content) VALUES (1, 'Hello, PostgreSQL world!'), (2, 'Learning about compound keys today.'), (3, 'Who else loves SQL?'); -- Insert Likes INSERT INTO Likes (UserID, PostID, IsLike) VALUES (2, 1, TRUE), -- Jane likes John's post (3, 1, TRUE), -- Bob likes John's post (1, 2, FALSE); -- John downvotes Jane's post -- Insert Comments INSERT INTO Comments (PostID, UserID, Content) VALUES (1, 2, 'Great first post, John!'), (1, 3, 'Welcome to the community!'), (2, 1, 'Compound keys are handy.');
4. CRUD Operations
CRUD stands for Create, Read, Update, and Delete. These are the four basic operations we can perform on our data. Let's go through each operation for our different entities.
Users
Create a new user:
INSERT INTO Users (Username, Email) VALUES ('alice_wonder', 'alice@example.com');
This adds a new user to our Users table.
Read user information:
SELECT * FROM Users WHERE Username = 'john_doe';
This retrieves all information about the user 'john_doe'.
Update a user's email:
UPDATE Users SET Email = 'newemail@example.com' WHERE Username = 'john_doe';
This changes John Doe's email address.
Delete a user:
DELETE FROM Users WHERE Username = 'alice_wonder';
This removes Alice from our Users table.
Posts
Create a new post:
INSERT INTO Posts (UserID, Content) VALUES (1, 'Just created my first PostgreSQL database!');
This adds a new post for user with UserID 1.
Read posts:
-- Get all posts SELECT * FROM Posts; -- Get posts from a specific user SELECT * FROM Posts WHERE UserID = 1;
The first query retrieves all posts, while the second gets posts only from user 1.
Update a post:
UPDATE Posts SET Content = 'Updated: Just created my first PostgreSQL database!' WHERE PostID = 4;
This modifies the content of the post with PostID 4.
Delete a post:
DELETE FROM Posts WHERE PostID = 4;
This removes the post with PostID 4.
Likes
Create a like:
INSERT INTO Likes (UserID, PostID, IsLike) VALUES (1, 2, TRUE);
This adds a like from user 1 to post 2.
Read likes:
-- Get all likes for a post SELECT * FROM Likes WHERE PostID = 1; -- Count likes and downvotes for a post SELECT SUM(CASE WHEN IsLike THEN 1 ELSE 0 END) as Likes, SUM(CASE WHEN NOT IsLike THEN 1 ELSE 0 END) as Downvotes FROM Likes WHERE PostID = 1;
The first query gets all likes for post 1. The second query counts the likes and downvotes for post 1.
Update a like (change to downvote):
UPDATE Likes SET IsLike = FALSE WHERE UserID = 1 AND PostID = 2;
This changes user 1's like on post 2 to a downvote.
Delete a like:
DELETE FROM Likes WHERE UserID = 1 AND PostID = 2;
This removes user 1's like/downvote from post 2.
Comments
Create a comment:
INSERT INTO Comments (PostID, UserID, Content) VALUES (2, 3, 'I agree, compound keys are very useful!');
This adds a new comment to post 2 from user 3.
Read comments:
-- Get all comments for a post SELECT * FROM Comments WHERE PostID = 1;
This retrieves all comments for post 1.
Update a comment:
UPDATE Comments SET Content = 'Updated: I agree, compound keys are incredibly useful!' WHERE CommentID = 4;
This modifies the content of comment 4.
Delete a comment:
DELETE FROM Comments WHERE CommentID = 4;
This removes comment 4.
5. Complex Queries using JOINs
Now, let's look at some more complex queries that join multiple tables. JOINs allow us to combine data from different tables based on related columns.
Get all posts with their like counts and comment counts:
SELECT p.PostID, p.Content, u.Username, COUNT(DISTINCT l.UserID) as LikeCount, COUNT(DISTINCT c.CommentID) as CommentCount FROM Posts p JOIN Users u ON p.UserID = u.UserID LEFT JOIN Likes l ON p.PostID = l.PostID AND l.IsLike = TRUE LEFT JOIN Comments c ON p.PostID = c.PostID GROUP BY p.PostID, p.Content, u.Username ORDER BY p.CreatedAt DESC;
This complex query does several things:
- It starts with the Posts table and joins it with the Users table to get the username of each post's author.
- It then left joins with the Likes table to count likes (not downvotes).
- It also left joins with the Comments table to count comments.
- The GROUP BY clause is used because we're using aggregate functions (COUNT).
- Finally, it orders the results by the post creation date, showing newest posts first.
Get all comments for a post, including commenter's username:
SELECT c.CommentID, c.Content, u.Username, c.CreatedAt FROM Comments c JOIN Users u ON c.UserID = u.UserID WHERE c.PostID = 1 ORDER BY c.CreatedAt;
This query:
- Starts with the Comments table.
- Joins it with the Users table to get the username of each commenter.
- Filters for comments on a specific post (PostID = 1).
- Orders the results by the comment creation date.
You've now built a pretty basic social media platform using PostgreSQL. This project has demonstrated the use of:
- Database schema design, including the use of primary keys, foreign keys, and compound keys
- Indexing for query performance optimization
- CRUD operations on multiple entities
- Complex queries using JOINs
Remember, this is just a starting point. A real-world social media platform would have many more features and optimizations. As you continue to learn and practice, you can expand on this project by adding features like:
- User authentication and password hashing
- Hashtag functionality for posts
- User follower/following relationships
- User profile information BONUS) If you know how to do it, add this database design to an API and see how all these skills fit together. If not, I'll be creating a backend series very soon!
Keep experimenting with PostgreSQL, and don't be afraid to try out new queries and database designs. The more you practice, the more comfortable you'll become with these concepts.
These are just your first steps. The more you try to break things, the more you'll gain confidence and ability with these skills.
Learning any new skill will only be perfected with consistent practice.
Until the next one, happy coding!