Parallel machines are becoming common and affordable. Prices of microprocessors, memory, and disks have dropped sharply. Recent computers feature multiple processors, and the trend is projected to accelerate. Databases are growing extremely large: large amounts of transaction data are collected and stored for later analysis, and multimedia objects such as images are increasingly stored in databases. Large-scale parallel database systems are increasingly used for storing large amounts of data, processing time-consuming decision-support queries, and providing high throughput for transaction processing.
Data can be partitioned across multiple disks for parallel I/O. Individual relational operations such as sort, join, and aggregation can be executed in parallel: the data is partitioned, and each processor works independently on its own partition. Queries are expressed in a high-level language (SQL, translated to relational algebra), which makes parallelization much easier. Different queries can also run in parallel with each other, and concurrency control takes care of the conflicts that may occur. Thus, databases lend themselves naturally to parallelism.
The forms of parallelism used in parallel databases include:
1. I/O Parallelism
2. Interquery Parallelism
3. Intraquery Parallelism
4. Intraoperation Parallelism
5. Interoperation Parallelism
I/O Parallelism reduces the time required to retrieve relations from disk by partitioning the relations across multiple disks. This includes horizontal partitioning, in which the tuples of a relation are divided among the disks such that each tuple resides on exactly one disk.
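Horizontal partitioning can be sketched in a few lines of Python. The two strategies shown, round-robin and hash partitioning, are common textbook choices and are not named in the text above. The function names, the `Row` type, and the `id` attribute are hypothetical and used only for illustration; each "disk" is modelled as an in-memory list.

```python
from typing import Callable

Row = dict  # assumption: a tuple of the relation is modelled as a dict


def round_robin_partition(rows: list[Row], n_disks: int) -> list[list[Row]]:
    """Send the i-th tuple to disk (i mod n_disks)."""
    disks: list[list[Row]] = [[] for _ in range(n_disks)]
    for i, row in enumerate(rows):
        disks[i % n_disks].append(row)
    return disks


def hash_partition(rows: list[Row], n_disks: int,
                   key: Callable[[Row], int]) -> list[list[Row]]:
    """Send each tuple to the disk chosen by hashing its partitioning attribute."""
    disks: list[list[Row]] = [[] for _ in range(n_disks)]
    for row in rows:
        disks[hash(key(row)) % n_disks].append(row)
    return disks


relation = [{"id": i, "name": f"item{i}"} for i in range(7)]
rr_disks = round_robin_partition(relation, 3)
hash_disks = hash_partition(relation, 3, key=lambda r: r["id"])
```

In both cases every tuple lands on exactly one disk, so the partitions can be scanned independently and their union is the original relation.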
In Interquery Parallelism, different queries or transactions execute in parallel with one another. This increases transaction throughput, so it is used primarily to scale up a transaction processing system to support a larger number of transactions per second. It is the easiest form of parallelism to support, particularly in a shared-memory parallel database, since even sequential database systems support concurrent processing. Interquery parallelism is more complicated to implement on shared-disk or shared-nothing architectures:
1. Locking and logging must be coordinated by passing messages between processors.
2. Data in a local buffer may have been updated at another processor.
3. Cache coherency has to be maintained, i.e., reads and writes of data in a buffer must find the latest version of the data.
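One simple way to meet the three requirements above in a shared-disk system is sketched below. The protocol is an assumption not stated in the text: a page is locked before it is accessed, re-read from the shared disk right after locking, and written back before unlocking if it was modified. The classes `SharedDisk` and `Node` are hypothetical, threads stand in for processors, and only exclusive locks are modelled to keep the sketch short.

```python
import threading


class SharedDisk:
    """Shared storage plus one lock per page (stands in for a lock manager)."""

    def __init__(self) -> None:
        self.pages: dict[str, int] = {}
        self._locks: dict[str, threading.Lock] = {}
        self._guard = threading.Lock()

    def lock_for(self, page_id: str) -> threading.Lock:
        with self._guard:
            return self._locks.setdefault(page_id, threading.Lock())


class Node:
    """A processor with its own local buffer."""

    def __init__(self, disk: SharedDisk) -> None:
        self.disk = disk
        self.buffer: dict[str, int] = {}

    def update(self, page_id: str, fn) -> int:
        with self.disk.lock_for(page_id):
            # Re-read after locking: the buffered copy may be stale.
            current = self.disk.pages.get(page_id, 0)
            new_value = fn(current)
            self.buffer[page_id] = new_value
            # Write back before unlocking so other nodes see the update.
            self.disk.pages[page_id] = new_value
        return new_value


disk = SharedDisk()
nodes = [Node(disk), Node(disk)]


def worker(node: Node) -> None:
    for _ in range(100):
        node.update("p", lambda v: v + 1)


threads = [threading.Thread(target=worker, args=(n,)) for n in nodes]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_value = disk.pages["p"]
```

Because every update re-reads the page under the lock, no increment is lost even though each node keeps its own buffer. Real systems avoid the cost of going to disk on every access with more elaborate protocols.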
Intraquery Parallelism is the execution of a single query in parallel on multiple processors. It is important for speeding up long-running queries.
There are two complementary forms of intraquery parallelism:
1. Intraoperation Parallelism – parallelizes the execution of each individual operation in the query.
2. Interoperation Parallelism – executes the different operations in a query expression in parallel.
The first form, intraoperation parallelism, scales better with increased parallelism because the number of tuples processed by each operation is typically much larger than the number of operations in a query.
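The two forms can be contrasted with a minimal Python sketch. The function names and the sample data are hypothetical. A thread pool is used only to show the structure; in CPython, threads do not speed up CPU-bound work, so a real system would use separate processes or machines.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(partition: list[int]) -> int:
    """Aggregate one partition; runs independently of the others."""
    return sum(partition)


def parallel_sum(partitions: list[list[int]], workers: int = 4) -> int:
    """Intraoperation parallelism: one SUM split across the partitions."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, partitions))


def select(rows, predicate) -> list[int]:
    """A selection operation over one relation."""
    return [r for r in rows if predicate(r)]


def parallel_selects_then_join(r, s) -> list[int]:
    """Interoperation parallelism: two independent selections run
    concurrently, and their results feed a (set-based) join."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        left_future = pool.submit(select, r, lambda x: x % 2 == 0)
        right_future = pool.submit(select, s, lambda x: x > 3)
        left, right = left_future.result(), right_future.result()
    return sorted(set(left) & set(right))


total = parallel_sum([[1, 2, 3], [4, 5], [6]])
joined = parallel_selects_then_join(range(10), range(10))
```

The degree of parallelism in `parallel_sum` grows with the number of partitions, whereas `parallel_selects_then_join` can never use more workers than there are independent operations in the query, which is the scaling argument made above.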