icde09-tutorial.handout
15 pages
Español

icde09-tutorial.handout

Le téléchargement nécessite un accès à la bibliothèque YouScribe
Tout savoir sur nos offres
15 pages
Español
Le téléchargement nécessite un accès à la bibliothèque YouScribe
Tout savoir sur nos offres

Description

e6eprobability densityprobability densityOutline1.SimilaritySearching2.Distance-basedindexingSimilaritySearching:3.DimensionreductionIndexing,NearestNeighborFinding,Dimensionality4.EmbeddingmethodsReduction,andEmbeddingMethods,forApplicationsin5.NearestneighborfindingMultimediaDatabasesHananSamethjs@cs.umd.eduDepartmentofComputerScienceUniversityofMarylandCollegePark,MD20742,USABasedonjointworkwithGisliR.Hjaltason.CurrentlyaScienceFoundationofIreland(SFI)WaltonFellowattheCentreforGeocomputationattheNationalUniversityofIrelandatMaynooth(NUIM)Copyright2009:HananSametSimilaritySearchingforMultimediaDatabasesApplications–p.1/114Copyright2009:HananSametSimilaritySearchingforMultimediaDatabasesApplications–p.2/114SimilaritySearchingVoronoiDiagramsApparentlystraightforwardsolution:Importanttaskwhentryingtofindpatternsinapplicationsinvolvingmining1.Partitionspaceintoregionswherealldifferenttypesofdatasuchasimages,video,timeseries,textdocuments,pointsintheregionareclosertotheDNAsequences,etc.region’sdatapointthantoanyotherSimilaritysearchingmoduleisacentralcomponentofcontent-baseddatapointretrievalinmultimediadatabases2.LocatetheVoronoiregioncorre-Problem ...

Informations

Publié par
Nombre de lectures 194
Langue Español

Extrait

e
6
e
probability density
probability density
Outline
1.
Similar
ity
Searching
2.
Distance-based
inde
xing
Similarity
Sear
c
hing:
3.
Dimension
reduction
Inde
xing,
Nearest
Neighbor
Finding,
Dimensionality
4.
Embedding
methods
Reduction,
and
Embed
ding
Methods,
f
or
Applications
in
5.
Nearest
neighbor
finding
Multimedia
Databases

Hanan
Samet
hjs@cs.umd.edu
Depar
tment
of
Computer
Science
Univ
ersity
of
Mar
yland
College
P
ar
k,
MD
20742,
USA
Based
on
joint
w
or
k
with
Gisli
R.
Hjaltason.

Currently
a
Science
F
oundation
of
Ireland
(SFI)
W
alton
F
ello
w
at
the
Centre
f
or
Geocomputation
at
the
National
Univ
ersity
of
Ireland
at
Ma
ynooth
(NUIM)
Cop
yr
ight
2009:
Hanan
Samet
Similar
ity
Searching
f
or
Multimedia
Databases
Applications

p
.1/114
Cop
yr
ight
2009:
Hanan
Samet
Similar
ity
Searching
f
or
Multimedia
Databases
Applications

p
.2/114
Similarity
Sear
c
hing
V
or
onoi
Dia
grams
Apparently
str
aightf
orw
ard
solution:
Impor
tant
task
when
tr
ying
to
find
patter
ns
in
applications
in
v
olving
mining
1.
P
ar
tition
space
into
regions
where
all
diff
erent
types
of
data
such
as
images
,
video
,
time
ser
ies
,
te
xt
documents
,
points
in
the
region
are
closer
to
the
DNA
sequences
,
etc.
region’
s
data
point
than
to
an
y
other
Similar
ity
searching
module
is
a
centr
al
component
of
content-based
data
point
retr
ie
v
al
in
m
ultimedia
databases
2.
Locate
the
V
oronoi
region
corre-
Prob
lem:
finding
objects
in
a
data
set
S
that
are
similar
to
a
quer
y
object
q
sponding
to
the
quer
y
point
based
on
some
distance
measure
d
which
is
usually
a
distance
metr
ic
d=
2
Prob
lem:
stor
age
and
constr
uction
cost
f
or
N
d
-dimensional
points
is
(
N
)
Sample
quer
ies:
1.
point:
objects
ha
ving
par
ticular
f
eature
v
alues
Impr
actical
unless
resor
t
to
some
high-dimensional
appro
ximation
of
a
V
oronoi
diag
r
am
(e
.g.,
OS-tree)
which
results
in
appro
ximate
nearest
neighbors
2.
r
ange:
objects
whose
f
eature
v
alues
f
all
within
a
giv
en
r
ange
or
where
the
distance
from
some
quer
y
object
f
alls
into
a
cer
tain
r
ange
Exponential
f
actor
corresponding
to
the
dimension
d
of
the
under
lying
space
in
3.
nearest
neighbor
:
objects
whose
f
eatures
ha
v
e
v
alues
similar
to
those
the
comple
xity
bounds
when
using
appro
ximations
of
V
oronoi
diag
r
ams
(e
.g.,
of
a
giv
en
quer
y
object
or
set
of
quer
y
objects
(
t;

)
-A
VD)
is
shifted
to
be
in
ter
ms
of
the
error
threshold

r
ather
than
in
ter
ms
of
the
n
umber
of
objects
N
in
the
under
lying
space
4.
closest
pairs:
pairs
of
objects
from
the
same
set
or
diff
erent
sets
which
are
sufficiently
similar
to
each
other
(v
ar
iant
of
spatial
join)
d
1
d
1
1.
(1
;

)
-A
VD:
O
(
N
=
)
space
and
O
(log
(
N
=
))
time
f
or
nearest
neigh-
Responses
in
v
ar
iab
ly
use
some
v
ar
iant
of
nearest
neighbor
finding
bor
quer
y
(
d
1)2
2.
(1
=
;

)
-A
VD:
O
(
N
)
space
and
O
(
t
+
log
N
)
time
f
or
nearest
neigh-
bor
quer
y
Cop
yr
ight
2009:
Hanan
Samet
Similar
ity
Searching
f
or
Multimedia
Databases
Applications

p
.3/114
Cop
yr
ight
2009:
Hanan
Samet
Similar
ity
Searching
f
or
Multimedia
Databases
Applications

p
.4/114
Appr
o
ximate
V
or
onoi
Dia
grams
(A
VD)
Appr
o
ximate
V
or
onoi
Dia
grams
(A
VD)
Representations
Example
par
titions
of
space
induced
b
y

neighbor
sets
A
AB
BC
BC
Dar
kness
of
shading
indicates
cardinality
of
nearest
neighbor
sets
with
BCF
white
corresponding
to
1
A
B
A
B
AD
ABE
BEF
CFI
C
C
ADE
ADE
E
AD
CFI
E
E
D
D
D
EF
AEG
EGH
F
F
G
G
GH
I
I
FHI
H
H
DG
G
H
HI
I
EFH
DEG
(

=
0
:
10
)
(

=
0
:
30
)
(

=
0
:
50
)
G
GH
H
HI
P
ar
tition
under
lying
domain
so
that
Allo
w
up
to
t

1
elements
r
(1

ib
i

t
)
of
S
to
be
associated
with
f
or


0
,
e
v
er
y
b
loc
k
b
is
asso-
ciated
with
some
element
r
in
S
each
b
loc
k
b
f
or
a
giv
en

,
where
b
such
that
r
is
an

-nearest
neigh-
b
each
point
in
b
has
one
of
the
r
as
ib
bor
f
or
all
of
the
points
in
b
(e
.g.,
its

-nearest
neighbor
(e
.g.,
(3,0)-
A
VD
or
(1,0.25)-A
VD)
A
VD)
Cop
yr
ight
2009:
Hanan
Samet
Similar
ity
Searching
f
or
Multimedia
Databases
Applications

p
.5/114
Cop
yr
ight
2009:
Hanan
Samet
Similar
ity
Searching
f
or
Multimedia
Databases
Applications

p
.6/114
Pr
ob
lem:
Cur
se
of
Dimensionality
Alternative
Vie
w
of
Cur
se
of
Dimensionality
Number
of
samples
needed
to
estimate
an
arbitr
ar
y
function
with
a
giv
en
Probability
density
function
(analogous
to
histog
r
am)
of
the
distances
of
le
v
el
of
accur
acy
g
ro
ws
e
xponentially
with
the
n
umber
of
v
ar
iab
les
(i.e
.,
the
objects
is
more
concentr
ated
and
has
a
larger
mean
v
alue
dimensions)
that
compr
ise
it
(Bellman)
Implies
similar
ity
search
algor
ithms
need
to
do
more
w
or
k
F
or
similar
ity
searching,
curse
means
that
the
n
umber
of
points
in
the
data
W
orst
case
when
d
(
x;
x
)
=
0
and
d
(
x;
y
)
=
1
f
or
all
y
=
x
set
that
need
to
be
e
xamined
in
der
iving
the
estimate
(

nearest
neighbor)
Impl

  • Univers Univers
  • Ebooks Ebooks
  • Livres audio Livres audio
  • Presse Presse
  • Podcasts Podcasts
  • BD BD
  • Documents Documents