Course Description
This course gives students in-depth knowledge of the advanced techniques and data structures used in relational and NoSQL databases. It focuses on writing complex SQL queries, choosing between relational and non-relational databases, and managing distributed data storage for efficient parallel processing in large-scale systems. It also explores methods for optimizing queries and keeping data consistent across distributed environments.

With growing demand for high-performance data management, this course introduces concepts relevant to high-performance computing (HPC) and AI-driven data processing. Topics such as data indexing, partitioning, replication, and distributed database architectures show students how to store and retrieve massive datasets efficiently. The course also covers NoSQL solutions, including key-value, document, wide-column, and graph databases, which are widely used in AI applications, large-scale analytics, and real-time processing.
Course Content Overview (12 Modules)
- Introduction to Advanced Databases – Overview of relational and NoSQL databases, scalability considerations.
- SQL Query Optimization – Techniques for improving query execution performance and reducing computational costs.
- Indexing Strategies – Accelerating database searches using B-trees, hash indexes, and bitmap indexes.
- Data Partitioning – Techniques for distributing large datasets across multiple servers to improve performance.
- Data Replication – Ensuring data availability and consistency in distributed systems.
- Database Security and Reliability – Access control, encryption, backup strategies, and fault tolerance.
- Distributed Databases – Architectures, data synchronization, and consistency models.
- Complex Data Structures – Advanced relational models, hierarchical structures, and semi-structured data.
- Introduction to NoSQL Databases – Key characteristics and comparison with relational databases.
- Key-Value and Document Databases – Working with Redis for high-speed key-value access and MongoDB for flexible document storage.
- Wide-Column Databases – Exploring Cassandra for handling massive-scale structured data.
- Graph Databases – Using Neo4j to model and query complex relationships, including in AI-powered applications.
Learning Outcomes
- Write and optimize complex SQL queries for efficient data retrieval and processing.
- Differentiate between relational and NoSQL databases and determine their appropriate use cases.
- Implement distributed storage solutions for handling large-scale data and parallel processing.
- Apply advanced indexing, partitioning, and replication techniques to optimize database performance.
- Work with NoSQL databases such as Redis, MongoDB, Cassandra, and Neo4j for scalable data management.