IEEE BYTE VOLUME-3 ISSUE-1 | Page 14

Differences in software pose a problem when processing transactions or queries. Sites may be unaware of each other and end up providing only limited facilities for cooperation in transaction or query processing. In heterogeneous systems, different nodes may have different hardware, software and data structures. Different computers, operating systems, database applications or data models may be used at each of the locations. The DBMS is required to support communication between the different sites, and users must be able to make requests in a database language at their local sites. However, there may be compatibility issues among these different structures, so a heterogeneous system is often not technically or economically feasible. For example, a user at one location may be able to read but not update the data at another location.
There are two principal approaches to storing a relation in a distributed database system:
A) Replication, wherein the system maintains several replicas of the same relation at different sites.
B) Fragmentation, wherein the relation is split into several fragments in such a way that the original relation can be reconstructed from them. The fragments are then distributed to different locations.
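The two approaches can be illustrated with a minimal Python sketch. It assumes horizontal fragmentation (splitting by rows), which is only one possible way to fragment a relation; the `ACCOUNTS` relation, the `city` attribute, the site names and all function names are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical relation: each row is a dict. "city" is an assumed
# attribute used to decide which fragment a row belongs to.
ACCOUNTS = [
    {"id": 1, "city": "Pune", "balance": 500},
    {"id": 2, "city": "Delhi", "balance": 900},
    {"id": 3, "city": "Pune", "balance": 150},
]


def fragment_horizontally(relation, attribute):
    """Split the rows into fragments keyed by the value of `attribute`."""
    fragments = defaultdict(list)
    for row in relation:
        fragments[row[attribute]].append(row)
    return dict(fragments)


def reconstruct(fragments):
    """Rebuild the original relation as the union of all fragments."""
    rows = [row for fragment in fragments.values() for row in fragment]
    return sorted(rows, key=lambda row: row["id"])


def replicate(relation, sites):
    """Keep a full, independent copy of the relation at every site."""
    return {site: [dict(row) for row in relation] for site in sites}


fragments = fragment_horizontally(ACCOUNTS, "city")
replicas = replicate(ACCOUNTS, ["site_a", "site_b"])
```

Here `reconstruct(fragments)` returns the same rows as `ACCOUNTS`, which is the property that makes a fragmentation valid, while `replicas` holds one complete copy per site.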
There are several reasons to use distributed database systems:
● They help in handling distributed data with different levels of transparency.
● They increase the reliability and availability of data.
● They provide local or site autonomy, which in turn leads to efficient data management.
● If there were ever a catastrophic event such as a fire, not all of the data would be in one place; it would be distributed across multiple locations. This protects large chunks of important data.
● Data is located near the site of greatest demand, allowing the load on the databases to be balanced among servers.
● Systems can be modified, added to, or removed from the distributed database system without affecting other nodes or data.
● The system continues to operate even if some nodes are offline or not working.
However, there are also reasons for caution:
● More extensive infrastructure means extra labour costs.
● Remote database fragments must be secured for data protection.
● In a distributed database, enforcing integrity over a network may require too much of the network's resources.
● In addition to the challenges posed by traditional database design, the design of a distributed database has to consider fragmentation of data, allocation of fragments to specific sites and data replication.
● Operations at two or more different sites or nodes may run concurrently. This poses a major problem for the operation of the system as a whole. It can be addressed by locking and timestamping.
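The timestamping idea in the last point can be sketched as follows. This is a minimal, single-process illustration of basic timestamp ordering, one common form of timestamp-based concurrency control; the text above does not specify which scheme is meant. The names `Item`, `Rejected`, `read` and `write` are hypothetical, and a real distributed DBMS would also have to generate globally unique timestamps and restart rejected transactions.

```python
class Rejected(Exception):
    """Raised when an operation arrives too late and must be aborted."""


class Item:
    """A data item that remembers the newest timestamps that touched it."""

    def __init__(self, value):
        self.value = value
        self.read_ts = 0   # largest timestamp that has read the item
        self.write_ts = 0  # largest timestamp that has written the item


def read(item, ts):
    """Read `item` on behalf of the transaction with timestamp `ts`."""
    if ts < item.write_ts:
        # A younger transaction has already overwritten the value.
        raise Rejected(f"read at ts={ts} is older than write_ts={item.write_ts}")
    item.read_ts = max(item.read_ts, ts)
    return item.value


def write(item, ts, value):
    """Write `value` to `item` on behalf of the transaction with timestamp `ts`."""
    if ts < item.read_ts or ts < item.write_ts:
        # A younger transaction has already read or written the item.
        raise Rejected(f"write at ts={ts} conflicts with a younger transaction")
    item.value = value
    item.write_ts = ts
```

For example, if a transaction with timestamp 2 has read an item, a later attempt by the transaction with timestamp 1 to write that item raises `Rejected`, because accepting it would invalidate what the younger transaction already read. Locking, the other technique mentioned, instead makes the conflicting operation wait rather than abort.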