Automatic Data Mining on Internet by Using PERL Scripting Language 66-9
pages may not be fully representative of the templates of all other pages. Learned rules perform poorly on pages that follow templates not covered by the labeled pages. This problem can be addressed by labeling more pages, because more pages cover more templates. However, manual labeling requires a large supply of labor and is time consuming, and it still yields unsatisfactory coverage of all possible templates.
This method of data extraction, DMR, optimizes the data mining process, which makes it a popular tool for extracting data from web pages. PERL scripts with regular expressions and modules increase both the speed and the accuracy of data extraction. The DMR is customized according to the required data and the output format that users desire.
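As an illustration of the regular-expression approach summarized above, the following is a minimal PERL sketch, not the authors' DMR implementation. The HTML fragment, the field names (title, year), and the subroutines extract_records and format_records are all hypothetical, chosen only to show how a pattern tied to one page template can pull out records and how the output format can be selected by the user.

```perl
use strict;
use warnings;

# Hypothetical page fragment (an assumption for this sketch). A real
# script would first download the page, for example with LWP::Simple.
my $html = <<'HTML';
<div class="item"><span class="title">First sample article</span> <span class="year">2009</span></div>
<div class="item"><span class="title">Second sample article</span> <span class="year">2010</span></div>
HTML

# Extract (title, year) pairs with a single regular expression.
# The pattern matches only this assumed template, so pages that
# follow a different template would yield no records.
sub extract_records {
    my ($page) = @_;
    my @records;
    while ($page =~ m!<span class="title">(.*?)</span>\s*<span class="year">(\d{4})</span>!gs) {
        push @records, { title => $1, year => $2 };
    }
    return @records;
}

# Render the records in a user-chosen format: 'tab' gives
# tab-separated lines, anything else gives comma-separated lines.
sub format_records {
    my ($format, @records) = @_;
    my $sep = $format eq 'tab' ? "\t" : ",";
    return map { join($sep, $_->{title}, $_->{year}) } @records;
}

my @records = extract_records($html);
print "$_\n" for format_records('csv', @records);
```

Keeping the pattern and the output formatting in separate subroutines reflects the customization described above: adapting the sketch to a different page template would mean changing only the regular expression, and a different output format would mean changing only the formatter.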
© 2011 by Taylor and Francis Group, LLC