HANDOUTS

STATISTICS AND PROBABILITY:

An Introduction:
Distributed database design is of major importance for query processing, because the definition of fragments is based on the objective of increasing reference locality and, sometimes, parallel execution of the most important queries.

### STA301 HIGHLIGHTED HANDOUTS

The role of a distributed query processor is to map a high-level query on a distributed database into a sequence of database operations on relation fragments.


Query processing problem:
The main function of a relational query processor is to transform a high-level query into an equivalent lower-level query. The low-level query actually implements the execution strategy for the query. The transformation must achieve both correctness and efficiency. It is correct if the low-level query has the same semantics as the original query, that is, if both queries produce the same result.

Consider the following relations: EMP(eNo, eName, title), ASG(eNo, pNo, resp, dur), and PROJ(pNo, pName, budget, location). Query: find the names of the employees who manage a project.

SELECT eName FROM EMP, ASG WHERE EMP.eNo = ASG.eNo AND resp = 'Manager'


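Two equivalent relational algebra queries that are correct transformations of the query above are

$$\Pi_{eName}\left(\sigma_{resp = \text{'Manager'} \,\land\, EMP.eNo = ASG.eNo}(EMP \times ASG)\right)$$

and

$$\Pi_{eName}\left(EMP \bowtie_{eNo} \sigma_{resp = \text{'Manager'}}(ASG)\right)$$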
Intuitively, the second query, which avoids the Cartesian product of EMP and ASG, consumes far fewer computing resources than the first and should therefore be retained.

In a centralized context, query execution strategies can be expressed well in an extension of relational algebra. The main task of a centralized query processor is to choose, for a given query, the best relational algebra query among all equivalent ones.
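The difference between the two strategies can be illustrated with a minimal Python sketch. The sample tuples, the function names `plan_cartesian` and `plan_select_then_join`, and the use of "largest intermediate result" as a stand-in for resource consumption are all assumptions made for illustration; they are not part of the handout.

```python
from itertools import product

# Hypothetical sample data, invented for illustration only.
EMP = [
    {"eNo": 1, "eName": "Ann", "title": "Engineer"},
    {"eNo": 2, "eName": "Bob", "title": "Analyst"},
    {"eNo": 3, "eName": "Cy", "title": "Programmer"},
]
ASG = [
    {"eNo": 1, "pNo": "P1", "resp": "Manager", "dur": 12},
    {"eNo": 2, "pNo": "P1", "resp": "Analyst", "dur": 24},
    {"eNo": 2, "pNo": "P2", "resp": "Manager", "dur": 6},
    {"eNo": 3, "pNo": "P2", "resp": "Programmer", "dur": 18},
]


def plan_cartesian(emp, asg):
    """Strategy 1: build EMP x ASG, then select, then project eName."""
    pairs = list(product(emp, asg))  # full Cartesian product
    kept = [
        (e, a)
        for e, a in pairs
        if a["resp"] == "Manager" and e["eNo"] == a["eNo"]
    ]
    names = {e["eName"] for e, _ in kept}
    return names, len(pairs)  # largest intermediate result is the product


def plan_select_then_join(emp, asg):
    """Strategy 2: select managers from ASG first, then join with EMP on eNo."""
    managers = [a for a in asg if a["resp"] == "Manager"]
    by_eno = {}
    for a in managers:
        by_eno.setdefault(a["eNo"], []).append(a)
    joined = [(e, a) for e in emp for a in by_eno.get(e["eNo"], [])]
    names = {e["eName"] for e, _ in joined}
    return names, max(len(managers), len(joined))


names_1, size_1 = plan_cartesian(EMP, ASG)
names_2, size_2 = plan_select_then_join(EMP, ASG)
print(names_1 == names_2)  # both strategies return the same names
print(size_1, size_2)      # the first builds a far larger intermediate result
```

Both functions return the same set of names, which is the correctness condition, while the first materializes every EMP and ASG pairing before filtering. A real optimizer would estimate such sizes from statistics rather than execute each candidate.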


In a distributed system, relational algebra is not enough to express execution strategies. It must be supplemented with operations for exchanging data between sites. The distributed query processor must also select the best sites to process the data, and possibly the way the data should be transformed. This enlarges the solution space from which a distributed execution strategy can be chosen, which makes distributed query processing considerably harder.
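To see how site selection enlarges the solution space, consider the rough sketch below. It assumes EMP and ASG are stored at two different sites and the result is needed at a third. The relation sizes, the three candidate strategies, and the cost model (one unit per tuple shipped, local processing ignored) are hypothetical and chosen only to illustrate the idea.

```python
# Hypothetical cardinalities (number of tuples); not taken from the handout.
SIZES = {"EMP": 400, "ASG": 1000, "managers": 20, "result": 20}

# Each candidate strategy is described by the data sets it ships between sites.
STRATEGIES = {
    "ship EMP and ASG to result site": ["EMP", "ASG"],
    "select at ASG site, join at EMP site": ["managers", "result"],
    "ship EMP to ASG site, join there": ["EMP", "result"],
}


def transfer_cost(shipped, sizes, cost_per_tuple=1.0):
    """Assumed cost model: only tuples sent over the network are charged."""
    return cost_per_tuple * sum(sizes[name] for name in shipped)


def best_strategy(strategies, sizes):
    """Return the cheapest strategy name and the cost of every candidate."""
    costs = {
        name: transfer_cost(shipped, sizes)
        for name, shipped in strategies.items()
    }
    return min(costs, key=costs.get), costs


best, costs = best_strategy(STRATEGIES, SIZES)
for name, cost in costs.items():
    print(f"{name}: {cost}")
print("chosen:", best)
```

The same relational algebra expression leads to several distributed strategies that differ only in where operations run and what is shipped, so the processor has more alternatives to compare than in the centralized case.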