Databases types: SQL, NoSQL, Column, Search, Key Value | System Design Tutorials | Part 6 | 2020

Part 6 of Yogita Sharma System Design Tutorial: Databases types: SQL, NoSQL, Column, Search, Key Value.

This video covers:

  • Types of Databases
  • Pros and Cons
  • Use Cases
  • Examples

Depending on the properties of the data and the volume of data that may need to be queried - different types of databases are commended for use. Different types of databases include:

  • Relational
  • Non-relational
  • File
  • Network

Non-relational databases can be further subdivided into:

  • Key-Value Stores
  • Column based DBs
  • Document based DBs
  • Search DBs

RELATIONAL DATABASES

Regarding relational databases, two main properties determine which types of relational databases are best to use:

  1. Schema
  2. ACID

1. Schema

The schema describes how data will be structures. An example of constraints used in database creation include:

  • Primary Key
  • Not Null
  • Foreign Key
  • Int 
  • VARCHAR
  • Default

create table EMPLOYEES {

id INT PRIMARY KEY,

name VARCHAR(50) NOT NULL,

age INT,

department_id INT NOT NULL,

FOREIGN KEY (department_id) REFERENCES department (Id)

}

The database help design complex relations between data. Databases also prevent garbage data such as null value is not populated. As well as ensuring that all other schema constraints are being adhered to.

2. ACID

The acronym stands for:

  • Atomocity
    • Either a transaction is handled in its entirety or not at all
    • An example of this is moving money between accounts
  • Consistency
    • Data is... well consistent.
    • For example, if two separate read operations are run on the same data - they must not return different results.
  • Isolation
    • Processing of one transaction cannot depend on another, they should not know about each other.
    • For example, a read of data, followed by a write to adjust the data read - does not result in another read.
  • Durability
    • Data needs to persist
Transactional systems such as banking are suited to using relational databases. If on the other hand the schema is not fixed, then a non-relational database is much more appropriate. Relational databases are very easy to scale vertically. This means providing more computing resources, such as memory to store the data. Horizontal scaling, splitting the work between multiple machines, is more challenging with relational databases.

NON-RELATIONAL DATABASES (also known as NoSQL)

Non-relational databases do not have a fixed schema.
Key-Value pair databases consist of keys and values that are accessed via hash of the key. Caches use key : value pairs. Examples include: Redis, Dynamo DB, Memcached. Since key look-up is fast - access in key-value DBs is speedy and some DBs are stored in memory ensuring faster access than DBs using file read from storage.

Document based databases are used when the schema is not fixed and heavy reads and writes are part of common use case. Document DBs have collections and documents. The collection is like a table and the document is like a row. Having all the data in one place makes it easier to query as multiple tables do not need to be read. However, since there is no schema - parts of a document may have null values.  Another con to using document based DBs is that they do not guarantee ACID properties. This can be compensated for using application code. 

In summary, regarding NoSQL databses - here are the pros and cons.

Pros: 
  • Highly scaleable
  • Sharding
  • Dynamic Data Flexibility
  • Special query operations / aggregation
Cons:
  • Cannot guarantee that data does not include null values
  • Cannot guarantee ACID properties

Column Based DBs are a half-way solution between key : value pairs and document based DBs. They have fixed tables and schema, however they do not guarantee ACID properties. They are best used when there are heavy reads such as streaming data. Other examples include health tracking information or dta captured from IoT (Internet of Things) devices. The table structure is based on the types of reads expected. A music streaming service might have a column based DB with tables for users, songs, users_by_liked_songs, and songs_by_users_liked.
Examples of Column based DBs include: Cassandra, HBase, Scylla. 

Search databases contain information about queries and advances indexes. Examples of search based DBs include: Elastic Search, Solar. A search database is often not the primary data store, it references the primary data story and refreshes it's data based on popular queries. 


Other types of data are stored in specialist databases. Image and Videos are often stored in S3, and Bucket. Large data sets contain data from large number of users, data from IoT devices, or data captured over time. 


Sometimes it's obvious which database type is the best to use in a system. Other times it's not and the type of database may need to change as common use cases from customers become clear or the type of data entering the system would be better stored in a different type of database. Some large companies even create their own database solutons.


No comments:

Post a Comment