Query algebra program databases
A query algebra for program databases

Abstract: Querying source code is an essential aspect of a variety of software engineering tasks such as program understanding, reverse engineering, program structure analysis and program flow analysis.
In this paper, we present and demonstrate the use of an algebraic source code query technique that blends expressive power with query compactness. The query framework of Source Code Algebra (SCA) permits users to express complex source code queries and views as algebraic expressions. Queries are expressed on an extensible, object-oriented database that stores program source code.
The SCA algebraic approach offers multiple benefits such as an applicative query language, high expressive power, seamless handling of structural and flow information, clean formalism and potential for query optimization.
Query Optimization: A single query can be executed through different algorithms or rewritten in different forms and structures. Hence the question of query optimization arises: which of these forms or pathways is the most efficient? The query optimizer attempts to determine the most efficient way to execute a given query by considering the possible query plans.

Importance: The goal of query optimization is to reduce the system resources required to fulfill a query, and ultimately to provide the user with the correct result set faster. First, it gives the user faster results, which makes the application feel faster. Second, it allows the system to service more queries in the same amount of time, because each request takes less time than it would unoptimized. Third, query optimization ultimately reduces the amount of wear on the hardware.

There are broadly two ways a query can be optimized:
Analyze and transform equivalent relational expressions. Here, we shall talk about generating minimal equivalent expressions. The equivalence rules listed below generate equivalent expressions for a query written in relational algebra. To optimize a query, we rewrite it into an equivalent form whenever an equivalence rule applies.

Conjunctive selection operations can be written as a sequence of individual selections: $\sigma_{\theta_1 \wedge \theta_2}(E) = \sigma_{\theta_1}(\sigma_{\theta_2}(E))$. This is called a sigma-cascade. Explanation: Applying the conjunction $\theta_1 \wedge \theta_2$ in a single pass can be expensive. Instead, first keep only the tuples satisfying $\theta_2$ (the inner selection), and then apply $\theta_1$ (the outer selection) to the fewer tuples that remain.
This leaves fewer tuples to process the second time. The rule extends to any number of conjoined selection conditions.

Selection is commutative: $\sigma_{\theta_1}(\sigma_{\theta_2}(E)) = \sigma_{\theta_2}(\sigma_{\theta_1}(E))$. Explanation: The conjunction $\wedge$ is commutative, so it does not matter whether we apply $\sigma_{\theta_1}$ first or $\sigma_{\theta_2}$ first. In practice, it is better to apply first the selection that yields fewer tuples, which saves time on the outer selection.

In a sequence of projections, all but the outermost can be omitted: $\pi_{L_1}(\pi_{L_2}(\ldots(\pi_{L_n}(E))\ldots)) = \pi_{L_1}(E)$, where $L_1 \subseteq L_2 \subseteq \ldots \subseteq L_n$. This is called a pi-cascade. Explanation: A cascade of projections is redundant, because in the end we keep only the columns specified in the last-applied, outermost projection. Hence, it is better to collapse all the projections into just one.
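The three rules above can be checked on a toy relation. The following is a minimal sketch, not a real query engine: relations are modelled as Python lists of dicts, and the helpers `select` and `project`, the `employees` data and the conditions `theta1` and `theta2` are all hypothetical names introduced only for illustration.

```python
def select(pred, rel):
    """sigma_pred(rel): keep the rows for which pred is true."""
    return [row for row in rel if pred(row)]


def project(cols, rel):
    """pi_cols(rel): keep only the named columns, dropping duplicate rows."""
    seen, out = set(), []
    for row in rel:
        key = tuple(row[c] for c in cols)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(cols, key)))
    return out


# Hypothetical sample data.
employees = [
    {"name": "Ann", "dept": "Eng", "salary": 90},
    {"name": "Bob", "dept": "Eng", "salary": 60},
    {"name": "Cy", "dept": "Ops", "salary": 95},
    {"name": "Dee", "dept": "Ops", "salary": 50},
]


def theta1(row):
    return row["salary"] > 80


def theta2(row):
    return row["dept"] == "Eng"


# Sigma-cascade: one conjunctive selection vs. two nested selections.
combined = select(lambda r: theta1(r) and theta2(r), employees)
cascade = select(theta1, select(theta2, employees))

# Commutativity: the order of the two selections does not change the result.
swapped = select(theta2, select(theta1, employees))

# Pi-cascade: only the outermost projection determines the result.
nested = project(["name"], project(["name", "dept"], employees))
direct = project(["name"], employees)
```

Here `combined`, `cascade` and `swapped` all hold the same single row, and `nested` equals `direct`. The rewrites never change the answer, only the amount of intermediate work.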
Selections on Cartesian products can be rewritten as theta joins.

Equivalence 1: $\sigma_{\theta}(E_1 \times E_2) = E_1 \bowtie_{\theta} E_2$. Explanation: The cross product operation is known to be very expensive, because it matches each tuple of $E_1$ ($m$ tuples in total) with each tuple of $E_2$ ($n$ tuples in total), producing $m \cdot n$ tuples. Instead, it is better to use the theta join, which yields only those entries of the cross product that satisfy the condition $\theta$, without evaluating the entire cross product first.

Equivalence 2: $\sigma_{\theta_1}(E_1 \bowtie_{\theta_2} E_2) = E_1 \bowtie_{\theta_1 \wedge \theta_2} E_2$. Explanation: A theta join sharply reduces the number of resulting tuples, so if we apply the conjunction of both conditions, $\theta_1 \wedge \theta_2$, inside the join itself, fewer tuples need to be scanned. On the other hand, a condition left outside the join unnecessarily increases the number of tuples to scan.

Theta joins are commutative: $E_1 \bowtie_{\theta} E_2 = E_2 \bowtie_{\theta} E_1$. Explanation: Because theta joins are commutative, the operand order can be chosen freely, and the query processing time depends to some extent on which table is used for the outer loop and which for the inner loop during the join, given the available indexing structures and blocks.
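The theta-join rules can be illustrated in the same toy setting. This sketch assumes relations are lists of dicts with distinct column names; `cross`, `theta_join`, `as_set` and the `emp` and `dept` data are hypothetical. The nested-loop `theta_join` shown here still compares every pair of rows, so what it saves is the materialization of the full cross product, not the comparisons. Index-based or hash-based join algorithms, which avoid the comparisons too, are outside this sketch.

```python
from itertools import product


def select(pred, rel):
    """sigma_pred(rel): keep the rows for which pred is true."""
    return [row for row in rel if pred(row)]


def cross(r, s):
    """E1 x E2: pair every row of r with every row of s."""
    return [{**a, **b} for a, b in product(r, s)]


def theta_join(pred, r, s):
    """E1 join_pred E2: nested loops that emit only the matching pairs."""
    out = []
    for a in r:
        for b in s:
            row = {**a, **b}
            if pred(row):
                out.append(row)
    return out


def as_set(rel):
    """Order-insensitive view of a relation, for comparing results."""
    return {frozenset(row.items()) for row in rel}


# Hypothetical sample data with disjoint column names.
emp = [
    {"emp": "Ann", "dept_id": 1, "salary": 90},
    {"emp": "Bob", "dept_id": 2, "salary": 60},
    {"emp": "Cy", "dept_id": 1, "salary": 95},
]
dept = [{"id": 1, "dept": "Eng"}, {"id": 2, "dept": "Ops"}]


def theta(row):
    return row["dept_id"] == row["id"]


def theta1(row):
    return row["salary"] > 80


# Equivalence 1: selection over a cross product vs. a theta join.
product_rows = cross(emp, dept)           # all m * n rows are built first
via_product = select(theta, product_rows)
via_join = theta_join(theta, emp, dept)   # only matching rows are built

# Equivalence 2: push the outer selection into the join condition.
outside = select(theta1, theta_join(theta, emp, dept))
inside = theta_join(lambda r: theta(r) and theta1(r), emp, dept)

# Commutativity: swapping the operands yields the same set of rows.
flipped = theta_join(theta, dept, emp)
```

In this example the cross product builds six rows to keep three, while the join builds only the three matching rows. `outside` and `inside` agree, and `flipped` contains the same rows as `via_join` in a different order.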
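Join operations are associative.

Natural join: $(E_1 \bowtie E_2) \bowtie E_3 = E_1 \bowtie (E_2 \bowtie E_3)$. Explanation: Natural joins are commutative as well as associative, so it is best to first join the two tables that yield the fewer tuples, and then apply the other join.

Theta join: $(E_1 \bowtie_{\theta_1} E_2) \bowtie_{\theta_2 \wedge \theta_3} E_3 = E_1 \bowtie_{\theta_1 \wedge \theta_3} (E_2 \bowtie_{\theta_2} E_3)$. Explanation: Theta joins are associative in the above manner, where $\theta_2$ involves attributes from only $E_2$ and $E_3$.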
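To see why join order matters even though the final result is the same, the sketch below joins three hypothetical relations (`students`, `enrolled`, `courses`) in both groupings and records the size of the intermediate result. The `natural_join` helper is a naive illustration that assumes every row of a relation has the same columns. It is not how a real optimizer or executor works.

```python
def natural_join(r, s):
    """Natural join: combine rows that agree on every shared column name."""
    if not r or not s:
        return []
    shared = set(r[0]) & set(s[0])
    return [
        {**a, **b}
        for a in r
        for b in s
        if all(a[c] == b[c] for c in shared)
    ]


def as_set(rel):
    """Order-insensitive view of a relation, for comparing results."""
    return {frozenset(row.items()) for row in rel}


# Hypothetical sample data.
students = [
    {"sid": 1, "name": "Ann"},
    {"sid": 2, "name": "Bob"},
    {"sid": 3, "name": "Cy"},
]
enrolled = [
    {"sid": 1, "cid": 10},
    {"sid": 2, "cid": 10},
    {"sid": 3, "cid": 20},
    {"sid": 1, "cid": 20},
]
courses = [{"cid": 10, "title": "DB"}]

# (students join enrolled) join courses
left_intermediate = natural_join(students, enrolled)
left_result = natural_join(left_intermediate, courses)

# students join (enrolled join courses)
right_intermediate = natural_join(enrolled, courses)
right_result = natural_join(students, right_intermediate)
```

Both groupings produce the same two rows, but the first builds a four-row intermediate relation and the second only a two-row one, which is the effect the explanation above describes.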