Wednesday, December 01, 2004

dev

Question: a piece of XML data will probably contain more than one attribute, and so will a query. Should I associate the attributes with each other to help the routing protocol?

Compared with SWAM:
1. The SWAM CIKM paper does not specify the sampling method used to establish long-range links
2. SWAM is not well suited to high dimensionality or to dynamically changing dimensionality
3. I still do not know how SWAM proves the log N result using Jon Kleinberg's distributed algorithm based on small-world theory
4. I have a hole in a tooth, so I should vote for the dental plan :)
5. The curse of dimensionality (Bellman 1961) refers to the exponential growth of hypervolume as a function of dimensionality (answer by Janne Sinkkonen). Our sampling method is not subject to the curse of dimensionality, because we sample only once on each dimension, and the long-range links are established only once, after sampling over all dimensions
6. Similarly, Monte Carlo methods randomly select values to create scenarios of a problem. These values are taken from within a fixed range and selected to fit a probability distribution [e.g. bell curve, linear distribution, etc.]. In Monte Carlo simulation, the random selection process is repeated many times to create multiple scenarios. Each randomly selected set of values forms one possible scenario and one solution to the problem. Together, these scenarios give a range of possible solutions, some more probable and some less probable. Monte Carlo simulation is advantageous because it is a "brute force" approach that can handle problems for which no other solution method exists. Unfortunately, this also means that it is computationally intensive and best avoided if simpler solutions are possible. Monte Carlo methods are most appropriate when other solutions are too complex or difficult to use.

7. Possible extensions: selectivity-based approaches; routing/sampling for arbitrary multi-dimensional data distributions; load balancing; data placement (to balance load for both storage and queries); information retrieval, which tolerates false hits so that we can employ broadcast plus routing; pub/sub systems, which reverse the roles of query and data; and Bloom-filter-based filtering as a routing hint

8. Euclidean space is also a Hilbert space, that is, a complete inner product space whose norm is induced by the inner product
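
To make notes 5 and 6 concrete, here is a minimal sketch that uses Monte Carlo sampling to estimate what fraction of the cube [-1, 1]^d is occupied by the unit ball, and compares it with the closed-form value. It is only an illustration of hypervolume behaviour as dimensionality grows; it is not the sampling method discussed above for long-range links, and the function names, sample count, and seed are my own assumptions.

```python
import math
import random


def ball_fraction_exact(d):
    """Exact fraction of the cube [-1, 1]^d occupied by the unit ball."""
    # Volume of the d-dimensional unit ball: pi^(d/2) / Gamma(d/2 + 1).
    ball_volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    # The cube [-1, 1]^d has side 2, hence volume 2^d.
    return ball_volume / 2 ** d


def ball_fraction_monte_carlo(d, samples=100_000, seed=0):
    """Estimate the same fraction by uniform random sampling in the cube."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    hits = 0
    for _ in range(samples):
        point = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        # A sample counts as a hit if it falls inside the unit ball.
        if sum(x * x for x in point) <= 1.0:
            hits += 1
    return hits / samples


if __name__ == "__main__":
    for d in (2, 3, 5, 10):
        print(d, ball_fraction_monte_carlo(d), ball_fraction_exact(d))
```

The fraction shrinks quickly as d grows, so uniform samples in the cube hit the ball less and less often. That is one way the growth of hypervolume makes naive sampling over the joint space expensive, in contrast to sampling once per dimension.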
