The Power of Sampling in Knowledge Discovery.
Jyrki Kivinen, Heikki Mannila:
The Power of Sampling in Knowledge Discovery.
PODS 1994: 77-85@inproceedings{DBLP:conf/pods/KivinenM94,
author = {Jyrki Kivinen and
Heikki Mannila},
title = {The Power of Sampling in Knowledge Discovery},
booktitle = {Proceedings of the Thirteenth ACM SIGACT-SIGMOD-SIGART Symposium
on Principles of Database Systems, May 24-26, 1994, Minneapolis,
Minnesota},
publisher = {ACM Press},
year = {1994},
isbn = {0-89791-642-5},
pages = {77-85},
ee = {http://doi.acm.org/10.1145/182591.182601, db/conf/pods/pods94-77.html},
crossref = {DBLP:conf/pods/94},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
We consider the problem of approximately verifying the truth of
sentences of tuple relational calculus in a given relation M by
considering only a random sample of M. We define two different
measures for the error of a universal sentence in a relation. For a
set of n universal sentences each with at most k universal
quantifiers, we give upper and lower bounds for the sample sizes
required for having a high probability that all the sentences with
error at least epsilon can be detected as false by considering the
sample. The sample sizes are O((log n)/epsilon) or
O((|M|(1-1/k) log n)/epsilon), depending on the error measure used.
We also consider universal-existential sentences.
Copyright © 1994 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
Load The ACM SIGMOD Anthology, CDROM Edition, Volume 1-3, PODS '82-'98.
and ...
Load The ACM SIGMOD Anthology, Silver Edition, DVD 1, Proceedings.
and ...
Printed Edition
Proceedings of the Thirteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, May 24-26, 1994, Minneapolis, Minnesota.
ACM Press 1994, ISBN 0-89791-642-5
Contents
[Abstract and Index Terms]
[Full Text in PDF Format, 939 KB]
References
- [Ch93]
- William W. Cohen:
Pac-Learning a Restricted Class of Recursive Logic Programs.
AAAI 1993: 86-92

- [Fag82]
- Ronald Fagin:
Horn clauses and database dependencies.
J. ACM 29(4): 952-985(1982)

- [GHK92]
- Adam J. Grove, Joseph Y. Halpern, Daphne Koller:
Asymptotic Conditional Probabilities for First-Order Logic.
STOC 1992: 294-305

- [HKM94]
- ...
- [HO91]
- Wen-Chi Hou, Gultekin Özsoyoglu:
Statistical Estimators for Aggregate Relational Algebra Queries.
ACM Trans. Database Syst. 16(4): 600-654(1991)

- [HS92]
- Peter J. Haas, Arun N. Swami:
Sequential Sampling Procedures for Query Size Estimation.
SIGMOD Conference 1992: 341-350

- [KI91]
- Ravi Krishnamurthy, Tomasz Imielinski:
Research Directions in Knowledge Discovery.
SIGMOD Record 20(3): 76-78(1991)

- [KM92]
- Jyrki Kivinen, Heikki Mannila:
Approximate Dependency Inference from Relations.
ICDT 1992: 86-98

- [LNS90]
- Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider:
Practical Selectivity Estimation through Adaptive Sampling.
SIGMOD Conference 1990: 1-11

- [Mug92]
- ...
- [NS90]
- Jeffrey F. Naughton, S. Seshadri:
On Estimating the Size of Projections.
ICDT 1990: 499-513

- [Nii87]
- ...
- [Odd86]
- ...
- [OSW89]
- ...
- [PS91]
- Gregory Piatetsky-Shapiro:
Discovery, Analysis, and Presentation of Strong Rules.
Knowledge Discovery in Databases 1991: 229-248

- [PSF91]
- Gregory Piatetsky-Shapiro, William J. Frawley (Eds.):
Knowledge Discovery in Databases.
AAAI/MIT Press 1991, ISBN 0-262-62080-4
Contents

- [Roy90]
- Shaibal Roy:
Semantic Complexity of Classes of Relational Queries and Query Independent Data Partitioning.
PODS 1991: 259-267

- [Ull88]
- Jeffrey D. Ullman:
Principles of Database and Knowledge-Base Systems, Volume I.
Computer Science Press 1988, ISBN 0-7167-8158-1
Contents

- [Val84]
- Leslie G. Valiant:
A Theory of the Learnable.
Commun. ACM 27(11): 1134-1142(1984)

- [ZZ93]
- ...
Referenced by
- Daniel Barbará, William DuMouchel, Christos Faloutsos, Peter J. Haas, Joseph M. Hellerstein, Yannis E. Ioannidis, H. V. Jagadish, Theodore Johnson, Raymond T. Ng, Viswanath Poosala, Kenneth A. Ross, Kenneth C. Sevcik:
The New Jersey Data Reduction Report.
IEEE Data Eng. Bull. 20(4): 3-45(1997)
- Heikki Mannila:
Methods and Problems in Data Mining.
ICDT 1997: 41-55
- Hannu Toivonen:
Sampling Large Databases for Association Rules.
VLDB 1996: 134-145
Copyright © Mon Nov 2 21:04:24 2009
by Michael Ley (ley@uni-trier.de)