the same language, and thus are a vital resource in research that seeks to isolate
characteristic features of translation (see below). Well-known monolingual
comparable corpora include Laviosa’s English Comparable Corpus (1998a,
1998b) and the Corpus of Translated Finnish (Mauranen 2004). Likewise, the
subcorpora in a bilingual corpus may be related through shared values for
attributes such as genre, date and place of publication, domain, etc., and thus
combine to form a bilingual comparable corpus. The New Corpus for Ireland
(Kilgarriff et al. 2006), designed in the first instance as a resource for English–Irish
(Gaelic) lexicography, is one such corpus. Bilingual (or multilingual)
comparable corpora are sometimes used as a data source in contrastive
linguistics, and are valued precisely because they are free from ‘various
translation effects’ (Altenberg and Granger 2002: 8). They are not without
problems, however: as with monolingual comparable corpora, it can be difficult
to ensure comparability between the subcorpora (see Bernardini and Zanettin
2004), and searching for ‘cross-linguistic equivalents’ (Altenberg and Granger
2002: 9) is not straightforward. Baker (1995: 233) has also expressed
reservations about their usefulness in theoretical translation studies, claiming that
their use is based upon the erroneous assumption that ‘there is a natural way of
saying anything in any language, and that all we need to do is to find out how to
say something naturally in language A and language B’.
The subcorpora in a bilingual (or multilingual) corpus may, on the other hand, be
related through translation, that is, the corpus may contain texts in one
language, alongside their translations into another language (or other languages).
Such corpora are commonly known as parallel corpora, although the term
translation corpus is also used (Altenberg and Granger 2002). Parallel
corpora are usually aligned (Véronis 2000). That is, explicit links are provided
between units of the source and target texts, usually at the sentence level. This
enables bilingual concordancing, where a search for a word in one language
returns all sentences containing that word, along with their aligned equivalent
sentences in the other language. Parallel corpora exist for several language pairs/groups
of languages. Some are a by-product of bilingual or multilingual
parliaments: the English–French Hansards in Canada (Church and Gale 1991)
and the multilingual Europarl corpus (Koehn 2005), which contains the
proceedings of the European Parliament, are two well-known examples. Other,
more hand-crafted, parallel corpora are created specifically for use in translation
studies and contrastive linguistics, and a number of variations on the basic
design are possible: a bilingual parallel corpus can be unidirectional or bi-directional,
for instance. Given that bi-directional corpora such as the English–Norwegian
Parallel Corpus (Johansson 1998) contain source texts (or
‘originals’) in both languages, they can also be used as bilingual comparable
corpora, provided, of course, that conditions of comparability obtain. Other
parallel corpora may contain, on their target sides, two or more translations into
the same language of the same source text (Winters 2005), or progressive
drafts of the emerging target text (Utka 2004). Parallel corpora have been used
in translation TRAINING AND EDUCATION to support students in finding
solutions to problems that characteristically arise in translation but not other
sorts of writing (Pearson 2003), and in research into translation SHIFTS
(Munday 1998a, 2002). They have also been used for the extraction of
de facto translation equivalents in bilingual terminography and lexicography
(Bowker and Pearson 2002: 171–2; Teubert 2002, 2004), and to provide
empirical data for corpus-based MACHINE TRANSLATION systems
(Hutchins 2005a).
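Bilingual concordancing over a sentence-aligned parallel corpus, as described above, can be illustrated with a minimal sketch in Python. It assumes that alignment has already been carried out and that the corpus is held as a simple list of (source, target) sentence pairs; the sample sentences and the names ALIGNED_PAIRS, tokenize and bilingual_concordance are hypothetical illustrations and are not taken from any of the corpora or tools cited here.

```python
import re

# Hypothetical toy corpus: each item is one aligned (source, target) sentence pair.
ALIGNED_PAIRS = [
    ("The house is old.", "La maison est vieille."),
    ("She bought a new house.", "Elle a acheté une nouvelle maison."),
    ("The garden is large.", "Le jardin est grand."),
]


def tokenize(sentence):
    """Split a sentence into lower-cased word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"\w+", sentence.lower())


def bilingual_concordance(aligned_pairs, word):
    """Return every aligned pair whose source sentence contains `word` as a whole token."""
    word = word.lower()
    return [(src, tgt) for src, tgt in aligned_pairs if word in tokenize(src)]


hits = bilingual_concordance(ALIGNED_PAIRS, "house")
for src, tgt in hits:
    # Print each source sentence next to its aligned target sentence.
    print(f"{src}\t{tgt}")
```

The search matches whole tokens only, so a query such as "hous" returns nothing; a real concordancer would typically add lemmatization, keyword-in-context display and searching from either language side.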
Corpus-based translation studies
Much early work in CTS set out to pursue the research agenda put forward in
Baker’s seminal 1993 article and investigated, on a scale that had not been
possible before, those recurrent features that were thought to make translation
different from other types of language production. These features, also called
UNIVERSALS of translation, included the reported tendency of translated
texts to be more explicit, use more conventional grammar and lexis, and be
somehow simpler than either their source texts or other texts in the target
language. Much of this work was concerned with operationalizing abstract
notions like simplification and EXPLICITATION (see, especially, Baker
1996a), and with investigating the potential of the quantitative techniques