GFS Architecture: Google organized GFS into clusters of computers. A cluster is essentially a network of computers. Each cluster may contain a very large number of computers and applications.
The GFS architecture contains three kinds of entities: clients, master servers, and chunk servers. A cluster generally has a single master, one or more clients, and many chunk servers. A backup master is also present in case the primary master fails.
Client: In GFS, the term "client" refers to any entity that requests a file. Requests can range from manipulating existing files to creating new ones. Clients can be other computers or computer applications; they are effectively the customers of GFS.
Chunk Servers: Chunk servers are the critical part of the GFS architecture. They do most of the hard work and store the data in chunks. Each chunk is 64 MB in size, which is large for most files. A chunk server always sends data directly to the client and does not involve the master in the data transfer.
The master is mostly involved in control-flow activities and works like a traffic router.
Master: The master acts as the coordinator and monitors all activity. Its responsibility is to coordinate the cluster and to update the operation log, in which it records its actions. The log is very important for troubleshooting because it shows where a failure occurred and when it happened.
The master stores changes to the metadata and sends requests to the chunk servers to create new chunks and delete existing ones. The master can be thought of as the storehouse: it holds the namespace and the mapping details of the chunks. The operation log is monitored continuously. If the log stops being updated, the system concludes that something is wrong and that the master may be down; in that case a backup server quickly takes its place so that operation does not stop.
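The bookkeeping described here can be pictured with a small in-memory model. This is only a minimal sketch, not the actual GFS implementation: the class name `MasterMetadata`, its fields, and the log-entry format are hypothetical, chosen to mirror the namespace, the chunk mapping, and the operation log mentioned in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MasterMetadata:
    """Toy model of the master's metadata (all names are illustrative)."""

    # Namespace: file name -> ordered chunk handles, one per chunk index.
    file_chunks: Dict[str, List[str]] = field(default_factory=dict)
    # Mapping: chunk handle -> chunk servers that hold a replica.
    chunk_locations: Dict[str, List[str]] = field(default_factory=dict)
    # Operation log: append-only record of metadata changes.
    operation_log: List[str] = field(default_factory=list)

    def create_chunk(self, filename: str, handle: str, servers: List[str]) -> None:
        # Record the change in the log first, then update the metadata.
        self.operation_log.append(f"create {filename} {handle}")
        self.file_chunks.setdefault(filename, []).append(handle)
        self.chunk_locations[handle] = list(servers)

    def lookup(self, filename: str, chunk_index: int) -> Tuple[str, List[str]]:
        # Resolve (file name, chunk index) to a chunk handle and its replicas.
        handle = self.file_chunks[filename][chunk_index]
        return handle, self.chunk_locations[handle]


meta = MasterMetadata()
meta.create_chunk("/logs/web.log", "h-0001", ["cs1", "cs2", "cs3"])
print(meta.lookup("/logs/web.log", 0))  # ('h-0001', ['cs1', 'cs2', 'cs3'])
```

Keeping the two mappings separate means a file's chunk list never changes when a replica moves between chunk servers; only `chunk_locations` has to be updated.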
The master uses the metadata to identify which chunk holds the data the client is looking for and which chunk server stores that chunk. The master also garbage-collects abandoned chunks. For disaster recovery and data safety, GFS makes several copies of each chunk and keeps them on different chunk servers.
Each copy is known as a replica. By default GFS makes three copies of every chunk, and the master controls this setting, so it can create additional replicas when needed. The replicas are not stored on the same chunk server; they are spread across different chunk servers so that if one chunk server is dead or unresponsive, another can answer and give the client the data it needs. This keeps the system running continuously.
Figure 1: GFS Architecture
Figure 1 shows the flow of information from the client to the master and then to the chunk server. The client goes to the master and asks which chunk server has the data.
The client sends the chunk index to the master. Each chunk server periodically sends a heartbeat to tell the master that it is alive. Once the master knows which chunk servers are available, it sends the metadata to the client; on that basis the client works out which part of the file it needs and then sends its request to the chunk server.
Read Operation: The application sends the file name and byte range to the GFS client. The client converts the byte offset given by the application into a chunk index.
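The conversion from a byte offset to a chunk index is plain integer arithmetic on the 64 MB chunk size. The following is a minimal sketch; the function name `chunk_requests` and its return format are assumptions made for illustration and are not part of any GFS API.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the chunk size given in the text


def chunk_requests(offset, length):
    """Split a byte range into (chunk_index, offset_in_chunk, nbytes) pieces."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    pieces = []
    end = offset + length
    while offset < end:
        index = offset // CHUNK_SIZE      # which chunk the byte falls in
        start = offset % CHUNK_SIZE       # position inside that chunk
        nbytes = min(CHUNK_SIZE - start, end - offset)
        pieces.append((index, start, nbytes))
        offset += nbytes
    return pieces


# A range that straddles the boundary between chunk 0 and chunk 1:
print(chunk_requests(CHUNK_SIZE - 4, 8))  # [(0, 67108860, 4), (1, 0, 4)]
```

A byte range that crosses a chunk boundary yields one piece per chunk, so the client would need one chunk index lookup for each piece.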
The client then sends the file name and chunk index to the master. The master determines which chunk servers hold the chunk and returns the chunk handle and the replica locations to the client.
The client then contacts a chunk server directly with the chunk handle and byte range, and the chunk server transfers the data to the client. The master is not involved in the data transfer, so it does not become the bottleneck of the process.
Write operation: The application sends the file name and the data to the client. The client sends the file name and chunk index to the master.
The master checks whether any chunk server currently holds control of the chunk (the chunk lease). If none does, the master grants control to one chunk server, which becomes the primary chunk server. The master then sends the replica locations and chunk handle to the client. The client sends the data to every replica. Once it has received the data, the primary sends a positive acknowledgment to the client. The client then sends the write command to the primary. The primary forwards the write command to the two secondary replicas in serial order. After the data is written, the secondaries reply to the primary and acknowledge the write.
The primary confirms to the client that the write operation is complete. The client can perform read and write operations in parallel.
Figure 2: Write operation data flow
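The write steps described above can be traced with a small in-memory simulation. It is a sketch under simplifying assumptions, not GFS code: the classes `ChunkServer` and `Master`, the function `client_write`, and all method names are hypothetical; there is no networking or failure handling; and each write is simply appended to the chunk.

```python
class ChunkServer:
    def __init__(self, name):
        self.name = name
        self.buffered = {}  # chunk handle -> data pushed but not yet written
        self.chunks = {}    # chunk handle -> data written so far

    def push_data(self, handle, data):
        # Data flow: hold the data until a write command arrives.
        self.buffered[handle] = data
        return True  # acknowledge receipt

    def apply_write(self, handle):
        # Control flow: commit the buffered data to the chunk.
        data = self.buffered.pop(handle)
        self.chunks[handle] = self.chunks.get(handle, b"") + data
        return True  # acknowledge the write

    def primary_write(self, handle, secondaries):
        # The primary writes first, then forwards the command to each
        # secondary in turn and collects their acknowledgments.
        self.apply_write(handle)
        return all(s.apply_write(handle) for s in secondaries)


class Master:
    def __init__(self):
        self.primary_for = {}  # chunk handle -> chunk server holding control

    def locate_for_write(self, handle, replicas):
        # Grant control to the first replica if nobody holds it yet.
        primary = self.primary_for.setdefault(handle, replicas[0])
        secondaries = [r for r in replicas if r is not primary]
        return primary, secondaries


def client_write(master, handle, replicas, data):
    primary, secondaries = master.locate_for_write(handle, replicas)
    for server in [primary] + secondaries:
        server.push_data(handle, data)  # send the data to every replica
    return primary.primary_write(handle, secondaries)  # then the write command


servers = [ChunkServer(n) for n in ("cs1", "cs2", "cs3")]
ok = client_write(Master(), "h-0001", servers, b"hello")
print(ok, [s.chunks["h-0001"] for s in servers])
```

Separating `push_data` from `apply_write` reflects the split in the text between sending the data to all replicas and sending the write command through the primary.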