Saturday, December 20, 2008

File systems and Disk formatting.

For all new Mac (and PC) users, external hard drives for backups and extra storage are essential accessories. One of the most critical questions is how to format the hard drive. In the Utilities folder inside your Applications folder there is a small tool called Disk Utility, and it is probably the most essential tool you have. It formats and clones drives, makes disk images and does many other great things. The one thing we will focus on here is the formatting of hard drives. Once you open the utility you will see a window that shows the list of hard drives on the left side and the various options, namely First Aid, Erase, RAID and Restore. Select Erase and you will get the following screen.
As you can see, there are various options for the format type and, of course, the name. The two main options are HFS Plus (Mac OS Extended) and FAT32. These are two drastically different formats, developed by Apple and Microsoft respectively, and they take different approaches to organizing the space on the hard drive. There are 4 variations of Mac OS Extended (case-sensitive, journaled and combinations thereof).
File Allocation Table, or FAT, is a computer file system architecture originally developed by Bill Gates and Marc McDonald during 1976/1977. It was the main file system throughout the history of Microsoft's consumer operating systems (MS-DOS, Windows) up to Windows Me. The FAT system divides the drive into sectors that each hold a fixed amount of data, and uses a table whose entries have a fixed number of bits (12, 16 or 32) to keep track of them. The system allocates a sector or a group of sectors to each file. Because the size of each file is recorded as a 32-bit binary number, the largest size that can be recorded for one file is just under 2^32 = 4,294,967,296 bytes = 4 GB. So there you have the number one issue with FAT32: a single file cannot exceed 4 GB. However, the format is accepted by any type of operating system: Mac OS, Windows, Linux/Unix etc. Another major benefit is that data loss is kept to a minimum in case of hard drive failure, since damage to a sector or to a neighboring group of sectors affects only the files that occupy them. One of the disadvantages is the fragmentation that builds up on the drive through repeated erasing and writing. This can increase the seek time and slow things down. Although de-fragmentation tools exist, they are not fully effective.
Pros: Minimum data loss.
Cons: Maximum file size 4 GB, fragmentation.
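
To make the 4 GB arithmetic concrete, here is a minimal Python sketch. It assumes the file size is stored as an unsigned 32-bit number, and the function names are made up for illustration; they are not part of any real tool.

```python
# Assumption: a FAT32 file size is stored in an unsigned 32-bit field.
SIZE_FIELD_BITS = 32


def max_fat32_file_size(bits: int = SIZE_FIELD_BITS) -> int:
    """Largest byte count that fits in a field of `bits` bits."""
    return 2 ** bits - 1


def fits_on_fat32(file_size_bytes: int) -> bool:
    """True if a single file of this size can be recorded on FAT32."""
    return 0 <= file_size_bytes <= max_fat32_file_size()


if __name__ == "__main__":
    print(f"{max_fat32_file_size():,} bytes")   # 4,294,967,295 bytes
    print(fits_on_fat32(700 * 1024 ** 2))       # a 700 MB file
    print(fits_on_fat32(5 * 1024 ** 3))         # a 5 GB video
```

The limit comes out one byte short of 4 GB because the count starts at zero.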

Hierarchical File System, or HFS, was introduced by Apple in September 1985 specifically to support Apple's first hard disk drive for the Macintosh, replacing the Macintosh File System (MFS), the original file system which had been introduced over a year and a half earlier with the first Macintosh computer. HFS divides the hard drive into logical blocks of 512 bytes. Those blocks are then grouped into allocation blocks that are assigned to the various files. HFS uses a 16-bit number to address those allocation blocks, so the number of different allocation blocks is at most 2^16 = 65,536 (65,535 in practice). This limit means that the "minimum" file size is equivalent to 1/65,535th of the size of the disk. So for a 1 GB disk, the allocation block size under HFS is 16 KB, and even a 1-byte file takes up 16 KB of disk space. However, being younger than FAT, it has a smarter way of handling folders. An HFS volume consists of 5 structures:
  • Logical blocks 0 and 1 contain the information for system start-up.
  • Block 2 contains the Master Directory Block (aka MDB). This defines a wide variety of data about the volume itself, for example date & time stamps for when the volume was created, the location of the other volume structures such as the Volume Bitmap or the size of logical structures such as allocation blocks.
  • Logical block 3 is the starting block of the Volume Bitmap, which keeps track of which allocation blocks are in use and which are free.
  • The Extent Overflow File is a B*-tree that contains extra extents that record which allocation blocks are allocated to which files, once the initial three extents in the Catalog File are used up.
  • The Catalog File is another B*-tree that contains records for all the files and directories stored in the volume.
This architecture results in more efficient seeking and faster response, and reduces the need for de-fragmentation. However, since a single structure keeps track of all the files, it prevents multitasking, and if the section of the hard drive that contains that information is damaged, the whole drive fails. To address this, Apple introduced HFS Plus, which fixed the multitasking issue, but catastrophic failure remains a risk. In the Mac OS X generation these formats are known as Mac OS Standard and Mac OS Extended. Although HFS Plus addressed most of the issues mentioned, this file system is not natively supported by other operating systems.
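
The allocation block arithmetic described above for HFS can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it uses the round figure of 2^16 = 65,536 allocation blocks so that the numbers match the 1 GB example.

```python
LOGICAL_BLOCK = 512              # bytes per logical block, as described above
MAX_ALLOCATION_BLOCKS = 2 ** 16  # simplification: the round 16-bit figure


def allocation_block_size(volume_bytes: int) -> int:
    """Smallest multiple of 512 bytes that covers the volume in 65,536 blocks."""
    per_block = -(-volume_bytes // MAX_ALLOCATION_BLOCKS)  # ceiling division
    logical_blocks = -(-per_block // LOGICAL_BLOCK)         # round up to 512s
    return logical_blocks * LOGICAL_BLOCK


def space_on_disk(file_bytes: int, block_size: int) -> int:
    """Space a file occupies: always a whole number of allocation blocks."""
    if file_bytes == 0:
        return 0  # simplification: an empty file takes no data blocks
    return -(-file_bytes // block_size) * block_size


if __name__ == "__main__":
    block = allocation_block_size(1024 ** 3)  # a 1 GB volume
    print(block)                              # 16384 bytes, i.e. 16 KB
    print(space_on_disk(1, block))            # a 1-byte file still uses 16384
```

The larger the volume, the larger each allocation block gets, which is why small files waste more space on big HFS drives.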
In November 2002 Apple introduced journaling, which allows the system to log changes before they are executed. Deleting a file typically involves two steps: removing the file's entry, and then marking the space it occupied as free. A power failure at either step will result in inconsistencies. For example, if the power fails between the first and the second step, the file's entry is gone but its space is still marked as occupied. Journaling first logs those changes, then executes them, and marks them as done once they are complete. That ensures that both steps are carried out properly and in the right order. Mac OS X 10.3 introduced case sensitivity, which distinguishes Names from names or NaMeS.
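
The idea behind journaling can be shown with a toy model in Python. This is a sketch of the general technique only, not how HFS Plus actually stores its journal; the `ToyVolume` class and everything in it are invented for illustration.

```python
class ToyVolume:
    """Hypothetical in-memory volume with a catalog, a block map and a journal."""

    def __init__(self):
        self.catalog = {}         # file name -> list of block numbers
        self.used_blocks = set()  # blocks currently marked as occupied
        self.journal = []         # changes logged before they are executed

    def create(self, name, blocks):
        self.catalog[name] = list(blocks)
        self.used_blocks.update(blocks)

    def delete(self, name, crash_after_step=None):
        blocks = self.catalog[name]
        # Log the intended change first.
        self.journal.append(("delete", name, list(blocks)))
        # Step 1: remove the file's entry.
        del self.catalog[name]
        if crash_after_step == 1:
            return  # simulated power failure; the journal entry survives
        # Step 2: mark the space as free.
        self.used_blocks.difference_update(blocks)
        # Both steps are done, so the journal entry can be dropped.
        self.journal.pop()

    def recover(self):
        """Replay unfinished journal entries; replaying twice is harmless."""
        while self.journal:
            op, name, blocks = self.journal.pop(0)
            if op == "delete":
                self.catalog.pop(name, None)
                self.used_blocks.difference_update(blocks)


if __name__ == "__main__":
    vol = ToyVolume()
    vol.create("movie.mov", [1, 2, 3])
    vol.delete("movie.mov", crash_after_step=1)
    print(vol.used_blocks)  # blocks 1, 2 and 3 are still marked as occupied
    vol.recover()
    print(vol.used_blocks)  # empty again after the journal is replayed
```

Without the journal, nothing would remember that blocks 1 to 3 belonged to a deleted file, and the space would stay lost until a full disk check.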

Pros: No 4 GB file size limit, up to 255 characters for file and folder names, optional case sensitivity, journaling.
Cons: Minimum space per file (16 KB on a 1 GB HFS disk), limit on the number of files, lack of cross-platform compatibility, danger of catastrophic data loss.

So when you buy a new drive, think before you start using it. If you plan to store videos that are larger than 4 GB (videos are usually the only single files larger than 4 GB), then go for Mac OS Extended, Case Sensitive; if you are planning cross-platform use, definitely use FAT32. This is especially true for USB drives, which have the added bonus of being flash memory drives with almost no seek time, regardless of fragmentation. Just keep in mind that K=1024, M=1024x1024=1,048,576 and G=1,073,741,824 bytes. So your 1 TB drive of 1,000,000,000,000 bytes is actually 931 GB and not 1000 GB.
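
The 931 GB figure is easy to check yourself. Here is a short Python sketch; the function name is just an illustration.

```python
K = 1024
M = K * K        # 1,048,576
G = K * K * K    # 1,073,741,824


def advertised_to_reported_gb(advertised_bytes: int) -> float:
    """Convert a manufacturer's byte count to GB counted in units of 1024^3."""
    return advertised_bytes / G


if __name__ == "__main__":
    one_tb = 1_000_000_000_000  # the advertised 1 TB, counted in powers of 10
    print(round(advertised_to_reported_gb(one_tb)))  # 931
```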
