Class 27: Database Fundamentals - Relational Databases & SQL
So far, our backend API has been using an in-memory array to store data. While useful for learning, this data is lost every time the server restarts. To build real-world applications, we need a way to store data persistently. This is where databases come in.
Today, we'll introduce the fundamentals of databases, focusing on Relational Databases and their standard query language, SQL.
Introduction to Databases
-
Why do we need databases?
- Persistence: Data remains stored even after the application or server shuts down.
- Organization: Data is structured and organized in a way that makes it easy to store, retrieve, and manage.
- Scalability: Databases are designed to handle large volumes of data and many concurrent users.
- Integrity: Mechanisms to ensure data is accurate and consistent.
- Security: Features for access control and data protection.
- Definition: A database is an organized collection of structured information, or data, typically stored electronically in a computer system.
Types of Databases (Overview)
Databases can be broadly categorized into two main types:
-
Relational Databases (SQL):
Store data in tables (relations) with predefined schemas. They use SQL (Structured Query Language) for data definition and manipulation. Excellent for structured data where relationships between entities are important.
-
Non-Relational Databases (NoSQL):
Store data in various formats (document, key-value, column-family, graph) and offer more flexibility with schemas. Designed for high scalability, flexibility, and performance for specific use cases. We'll cover these in a later class.
Relational Database Management Systems (RDBMS)
An RDBMS is a program that allows you to create, update, and administer a relational database.
Core Concepts:
-
Tables (Relations):
The fundamental structure in an RDBMS. Data is organized into rows and columns, similar to a spreadsheet. Each table represents a specific entity (e.g., Users, Products, Orders).
-
Rows (Records/Tuples):
A single entry of data within a table. Each row represents one instance of the entity.
-
Columns (Fields/Attributes):
Individual data points within a row. Each column has a specific data type (e.g., TEXT, INTEGER, BOOLEAN, DATE).
-
Primary Key (PK):
A column (or set of columns) that uniquely identifies each row in a table. It cannot contain NULL values and must contain unique values. (e.g.,
user_id
,product_id
). -
Foreign Key (FK):
A column (or set of columns) in one table that refers to the Primary Key in another table. Foreign keys establish relationships between tables.
-
Schema:
The logical structure of the entire database. It defines the tables, columns, data types, relationships, and constraints.
Popular RDBMS:
- PostgreSQL: Powerful, open-source, object-relational database system known for its reliability, feature robustness, and performance.
- MySQL: Very popular open-source relational database, widely used for web applications.
- SQLite: A self-contained, serverless, zero-configuration, transactional SQL database engine. Ideal for local development, testing, or embedded applications.
- Oracle Database: Commercial, enterprise-grade database system.
- SQL Server (Microsoft): Commercial, enterprise-grade database system from Microsoft.
Introduction to SQL (Structured Query Language)
SQL is the standard language for managing and manipulating relational databases. It's used to communicate with the database to store, retrieve, modify, and delete data.
Categories of SQL commands:
-
DDL (Data Definition Language):
Commands used to define, modify, or drop the structure of database objects (tables, indexes, views, etc.).
CREATE TABLE
: Create a new table.-
ALTER TABLE
: Modify an existing table. DROP TABLE
: Delete a table.
-
DML (Data Manipulation Language):
Commands used for managing data within schema objects.
-
SELECT
: Retrieve data from a database. -
INSERT INTO
: Insert new data into a table. -
UPDATE
: Modify existing data in a table. -
DELETE FROM
: Delete data from a table.
-
-
DCL (Data Control Language):
Commands used to control access to data in the database.
-
GRANT
: Give user access privileges. -
REVOKE
: Remove user access privileges.
-
-
TCL (Transaction Control Language):
Commands used to manage transactions in the database.
COMMIT
: Save changes permanently.ROLLBACK
: Undo changes.
Basic SQL Queries (DML)
Let's look at some fundamental DML queries.
SELECT
: Retrieving Data
Used to retrieve data from one or more tables.
-- Select all columns from the 'users' table
SELECT * FROM users;
-- Select specific columns (name, email) from the 'users' table
SELECT name, email FROM users;
WHERE
: Filtering Rows
Used to filter records based on a specified condition.
-- Select users where age is greater than 30
SELECT * FROM users WHERE age > 30;
-- Select products with price less than or equal to 50
SELECT name, price FROM products WHERE price <= 50;
-- Select users named 'Alice'
SELECT * FROM users WHERE name = 'Alice';
-- Select products whose name starts with 'L' (case-insensitive depending on DB)
SELECT * FROM products WHERE name LIKE 'L%';
-- Select users whose ID is 1, 3, or 5
SELECT * FROM users WHERE id IN (1, 3, 5);
-- Combine conditions with AND/OR
SELECT * FROM users WHERE age > 25 AND city = 'New York';
SELECT * FROM products WHERE category = 'Electronics' OR price > 1000;
ORDER BY
: Sorting Results
Used to sort the result-set in ascending (ASC
) or
descending (DESC
) order.
-- Select all users, ordered by name alphabetically (ascending)
SELECT * FROM users ORDER BY name ASC;
-- Select products, ordered by price from highest to lowest
SELECT name, price FROM products ORDER BY price DESC;
LIMIT
and OFFSET
: Paginating Results
Used to restrict the number of rows returned by a query and to skip a certain number of rows. Useful for pagination.
-- Select only the first 10 users
SELECT * FROM users LIMIT 10;
-- Select 10 users, starting from the 21st user (for page 3, if each page has 10 items)
SELECT * FROM users LIMIT 10 OFFSET 20;
Database Setup (e.g., SQLite for local practice)
For local practice and quick prototyping, SQLite is an excellent choice because it's a file-based database (no server process needed) and requires zero configuration.
Steps to set up SQLite:
-
Install DB Browser for SQLite (Recommended
GUI):
This free, open-source tool provides a graphical interface to create, design, and browse SQLite database files. Download it from sqlitebrowser.org/dl/.
-
Create a new database file:
Open DB Browser for SQLite, go to "File" > "New Database...", and save it as
my_database.db
in your project directory (e.g., in adata/
folder). -
Create a basic table:
Go to the "Execute SQL" tab and run the following DDL query to create a
books
table:CREATE TABLE books ( id INTEGER PRIMARY KEY AUTOINCREMENT, title TEXT NOT NULL, author TEXT NOT NULL, year INTEGER );
Click the "Execute" button (play icon). Then, go to the "Database Structure" tab and refresh to see your new table.
-
Insert some sample data:
In the "Execute SQL" tab, run these INSERT statements:
INSERT INTO books (title, author, year) VALUES ('The Hitchhiker''s Guide to the Galaxy', 'Douglas Adams', 1979); INSERT INTO books (title, author, year) VALUES ('1984', 'George Orwell', 1949); INSERT INTO books (title, author, year) VALUES ('To Kill a Mockingbird', 'Harper Lee', 1960);
-
Query your data:
In the "Execute SQL" tab, run a SELECT query:
SELECT * FROM books;
You should see the data you just inserted.
This setup gives you a working relational database to interact with. In the next class, we'll dive into more advanced SQL concepts like joins and database design, which are essential for building robust applications.