Thursday, July 14, 2005

Grosch's law

In the 1960’s, Herb Grosch observed that there is
an economy-of-scale in computing. At that time, expensive computers were much more
powerful than inexpensive computers. This gave rise to super-linear speedups and scaleups.

Tuesday, July 12, 2005

branch and bound

Definition: An algorithmic technique to find the optimal solution by keeping the best solution found so far. If a partial solution cannot improve on the best, it is abandoned.

BGP protocol

excerpt from http://www.cisco.com/univercd/cc/td/doc/cisintwk/ito_doc/bgp.htm

The Border Gateway Protocol (BGP) is an interautonomous system routing protocol. An autonomous system is a network or group of networks under a common administration and with common routing policies. BGP is used to exchange routing information for the Internet and is the protocol used between Internet service providers (ISP). Customer networks, such as universities and corporations, usually employ an Interior Gateway Protocol (IGP) such as RIP or OSPF for the exchange of routing information within their networks. Customers connect to ISPs, and ISPs use BGP to exchange customer and ISP routes. When BGP is used between autonomous systems (AS), the protocol is referred to as External BGP (EBGP). If a service provider is using BGP to exchange routes within an AS, then the protocol is referred to as Interior BGP (IBGP).

Thursday, July 07, 2005

Occam's razor

from http://pespmc1.vub.ac.be/OCCAMRAZ.html

Occam's Razor

one should not increase, beyond what is necessary, the number of entities required to explain anything


Occam's razor is a logical principle attributed to the mediaeval philosopher William of Occam (or Ockham). The principle states that one should not make more assumptions than the minimum needed. This principle is often called the principle of parsimony. It underlies all scientific modelling and theory building. It admonishes us to choose from a set of otherwise equivalent models of a given phenomenon the simplest one. In any given model, Occam's razor helps us to "shave off" those concepts, variables or constructs that are not really needed to explain the phenomenon. By doing that, developing the model will become much easier, and there is less chance of introducing inconsistencies, ambiguities and redundancies.

Though the principle may seem rather trivial, it is essential for model building because of what is known as the "underdetermination of theories by data". For a given set of observations or data, there is always an infinite number of possible models explaining those same data. This is because a model normally represents an infinite number of possible cases, of which the observed cases are only a finite subset. The non-observed cases are inferred by postulating general rules covering both actual and potential observations.

For example, through two data points in a diagram you can always draw a straight line, and induce that all further observations will lie on that line. However, you could also draw an infinite variety of the most complicated curves passing through those same two points, and these curves would fit the empirical data just as well. Only Occam's razor would in this case guide you in choosing the "straight" (i.e. linear) relation as best candidate model. A similar reasoning can be made for n data points lying in any kind of distribution.

Occam's razor is especially important for universal models such as the ones developed in General Systems Theory, mathematics or philosophy, because there the subject domain is of an unlimited complexity. If one starts with too complicated foundations for a theory that potentially encompasses the universe, the chances of getting any manageable model are very slim indeed. Moreover, the principle is sometimes the only remaining guideline when entering domains of such a high level of abstraction that no concrete tests or observations can decide between rival models. In mathematical modelling of systems, the principle can be made more concrete in the form of the principle of uncertainty maximization: from your data, induce that model which minimizes the number of additional assumptions.

This principle is part of epistemology, and can be motivated by the requirement of maximal simplicity of cognitive models. However, its significance might be extended to metaphysics if it is interpreted as saying that simpler models are more likely to be correct than complex ones, in other words, that "nature" prefers simplicity.

Wednesday, July 06, 2005

preference query

http://www.cse.buffalo.edu/~chomicki/nsf03/prog2003.html

Preference queries involve the use of a number of algebraic preference operators that have simple formal semantics. The most basic of those is winnow which, when applied to a relation, returns the set of the most preferred tuples -- those which are not dominated by any other tuples in the relation. The winnow operator is parameterized by a preference formula. For example, given a suitable formula and a database of books for sale winnow will return all the cheapest ways to purchase every book. Other preference operators include ranking (unbounded iteration of winnow) and preference range selection.

Monday, July 04, 2005

ACID

Following is a definition and a description of each ACID property:

  • Atomic A transaction must execute exactly once and must be atomic—either all of the work is done or none of it is. Operations within a transaction usually share a common intent and are interdependent. By performing only a subset of these operations, the system could compromise the overall intent of the transaction. Atomicity eliminates the chance of processing only a subset of operations.

  • Consistent A transaction must preserve the consistency of data, transforming one consistent state of data into another consistent state of data. Much of the responsibility for maintaining consistency falls to the application developer.

  • Isolated A transaction must be a unit of isolation, which means that concurrent transactions should behave as if each were the only transaction running in the system. Because a high degree of isolation can limit the number of concurrent transactions, some applications reduce the isolation level in exchange for better throughput. See Configuring Transaction Isolation Levels for more information.

  • Durable A transaction must be recoverable and therefore must have durability. If a transaction commits, the system guarantees that its updates can persist even if the computer crashes immediately after the commit. Specialized logging allows the system's restart procedure to complete unfinished operations required by the transaction, making the transaction durable.

Bloomjoin

The main difference from the semijoin is that a bloom filter is generated over the join column of one relation, which is sent to the other site for positive check.

instance optimality

@inproceedings{FaginLN01,
AUTHOR = {R. Fagin and A. Lotem and M. Naor},
TITLE = "Optimal Aggregation Algorithms for Middleware",
booktitle = PODS,
YEAR = "2001",
pages = "102-113",
}

Intuitively, instance optimality corresponds to optimality
in every instance, as opposed to just the worst case or the
average case. There are many algorithms that are optimal
in a worst-case sense, but are not instance optimal. An
example is binary search: in the worst case, binary search
is guaranteed to require no more than log N probes, for N
data items. However, for each instance, a positive answer
can be obtained in one probe, and a negative answer in two
probes.

Sunday, July 03, 2005

A* search algorithm

From Wikipedia, the free encyclopedia

The A* search algorithm (pronounced "A star") is a graph search algorithm that finds a path from a given initial node to a given goal node (or one passing a given goal test). It employs a "heuristic estimate" that ranks each node by an estimate of the best route that goes through that node. It visits the nodes in order of this heuristic estimate. The A* algorithm is therefore an example of best-first search. A* is not guaranteed to perform better than simpler search algorithms. In a maze-like environment, the only way to reach your destination might be to travel south first and eventually turn around. In this case trying nodes closer to your destination first may cost you time.