File systems. The structure of the file system. FAT file system architecture

This article is about file systems. When you install Windows, the installer prompts you to select a file system for the partition where Windows will be installed, and PC users must choose between two options: FAT or NTFS.

In most cases, users are content knowing that NTFS is "better" and choose this option.

However, sometimes they wonder: what exactly makes it better?

In this article, I will try to explain what a file system is, what kinds exist, how they differ, and which one should be used.

Some technical details of file systems are simplified here to make the material easier to follow.

A file system is a way of organizing data on storage media. The file system determines where and how files are written to the media and provides the operating system with access to those files.

Additional requirements are imposed on modern file systems: the ability to encrypt files, access control for files, and additional attributes. Usually the file system is written at the beginning of the hard disk.

From the point of view of the OS, a hard drive is a set of clusters.

A cluster is an area of a disk of a certain size for storing data. The minimum cluster size is 512 bytes. Since the binary number system is used, cluster sizes are powers of two (512 bytes, 1 KB, 2 KB, and so on).

The user can figuratively imagine a hard drive as a checkered notepad. One cell on the page is one cluster. The file system is the content of the notepad, and the file is the word.

For hard drives in a PC, two file systems are currently the most common: FAT and NTFS. FAT (FAT16) appeared first, followed by FAT32 and NTFS.

FAT (FAT16) is an abbreviation for File Allocation Table.

The FAT structure was developed by Bill Gates and Marc McDonald in 1977. It was used as the main file system in DOS and in Microsoft Windows operating systems up to and including Windows ME.

There are four versions of FAT: FAT12, FAT16, FAT32, and exFAT. They differ in the number of bits allocated to store the cluster number.

FAT12 is mainly used for floppy disks, FAT16 for small disks, and the newer exFAT mainly for flash drives. The maximum cluster size supported by FAT is 64 KB.

FAT16 was first introduced in November 1987. The index 16 in the name indicates that 16 bits are used for the cluster number. As a result, the maximum size of a disk partition (volume) that this system can support is 4 GB.
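The 4 GB figure follows from simple arithmetic, sketched below in Python. It assumes the 64 KB maximum cluster size mentioned above and ignores the few cluster numbers that are reserved as special marks; the variable names are only illustrative.

```python
# Maximum FAT16 volume size: addressable clusters times the largest cluster size.
CLUSTER_NUMBER_BITS = 16        # bits used for a cluster number in FAT16
MAX_CLUSTER_SIZE = 64 * 1024    # 64 KB, the largest cluster size FAT supports

max_clusters = 2 ** CLUSTER_NUMBER_BITS            # 65536 cluster numbers
max_volume_bytes = max_clusters * MAX_CLUSTER_SIZE
max_volume_gb = max_volume_bytes / 2 ** 30          # convert bytes to GB

print(max_volume_gb)
```

With smaller clusters the limit shrinks proportionally, which is why large cluster sizes were needed on larger FAT16 volumes.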

Later, as technology developed and disks larger than 4 GB appeared, the FAT32 file system was introduced. It uses 32-bit cluster addressing and was released with Windows 95 OSR2 in August 1996. FAT32 is limited in volume size to 128 GB. This system also supports long filenames.

NTFS (an abbreviation of New Technology File System) is the standard file system for the Microsoft Windows NT family of operating systems.

It was introduced on July 27, 1993 with Windows NT 3.1. NTFS is based on the HPFS file system (High Performance File System), which Microsoft created together with IBM for the OS/2 operating system.

Main features of NTFS: built-in capabilities to restrict access to data for different users and user groups, as well as assign quotas (restrictions on the maximum amount of disk space occupied by certain users), the use of a journaling system to improve the reliability of the file system.

The file system specifications are closed. The cluster size is usually 4 KB. In practice, it is not recommended to create volumes larger than 2 TB. Hard drives have only just reached this size, so perhaps a new file system awaits us in the future.

During installation, Windows XP offers to format the disk with FAT or NTFS. Here FAT means FAT32.

All file systems are built on the principle "one cluster, one file": a cluster stores the data of only one file.

The main difference for the average user between these systems is the size of the cluster. “A long time ago, when disks were small and files were very small,” it was very noticeable.

Consider the example of one volume on a 120GB disk and a 10Kb file.

For FAT32 the cluster size will be 32 KB, and for NTFS it will be 4 KB.

In FAT32 such a file will occupy 1 cluster, leaving 32-10=22 KB of unused space.

In NTFS such a file will take up 3 clusters, leaving 12-10=2 KB of unused space.

By analogy with a notebook, a cluster is a cell. And having put a dot in a cell, we already logically occupy it all, but in reality there is a lot of free space.

Thus, the transition from FAT32 to NTFS allows more efficient use of the hard disk when the system holds a large number of small files.
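The example with the 10 KB file can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name allocated_size is my own and not part of any file system API.

```python
import math

KB = 1024

def allocated_size(file_size, cluster_size):
    """Disk space a file really occupies: a whole number of clusters."""
    clusters = math.ceil(file_size / cluster_size)
    return clusters * cluster_size

file_size = 10 * KB

for name, cluster_size in [("FAT32", 32 * KB), ("NTFS", 4 * KB)]:
    used = allocated_size(file_size, cluster_size)
    unused = used - file_size   # space inside the last cluster that is wasted
    print(name, used // cluster_size, "cluster(s),", unused // KB, "KB unused")
```

Multiply the unused remainder by many thousands of small files and the difference between the two cluster sizes becomes noticeable.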

In 2003, I had a 120 GB drive divided into 40 GB and 80 GB volumes. When I switched from Windows 98 to Windows XP and converted the drive from FAT32 to NTFS, I gained about 1 GB of free disk space. At that time it was a significant "increase".

To find out which file system is used on the hard disk volumes of your PC, open the volume properties window and read this information on the "General" tab.

A volume is a synonym for a disk partition; users usually call a volume "drive C", "drive D", etc. An example is shown in the picture below:

Currently, disks with a capacity of 320 GB or more are widely used. Therefore, I recommend using NTFS for optimal use of disk space.

Also, if there are several users on a PC, NTFS allows you to configure file access in such a way that different users cannot read and modify files of other users.

In organizations, when working on a local network, system administrators use other features of NTFS.

If you are interested in organizing access to files for several users on one PC, then the following articles will describe this in detail.

Materials from en.wikipedia.org were used in writing this article.

Article author: Maxim Telpari
PC user with 15 years of experience. Support specialist of the video course "Confident PC user", after studying which you will learn how to assemble a computer, install Windows XP and drivers, restore the system, work in programs and much more.


Material for review lecture No. 33

for students of the specialty

"Information Technology Software"

Associate Professor of the Department of ICT, Ph.D. Livak E.N.

FILE MANAGEMENT SYSTEMS

Basic concepts, facts

Purpose. Features of the FAT, VFAT, FAT32, HPFS, and NTFS file systems. File systems of UNIX (s5, ufs) and Linux (Ext2FS). System areas of the disk (partition, volume). Principles of file placement and storage of information about the location of files. Directory organization. Restricting access to files and directories.

Skills

Using knowledge about the structure of the file system to protect and restore computer information (files and directories). Organization of access control to files.

File systems. File system structure

Data is stored on disk as files. A file is a named part of a disk.

File management systems are designed to manage files.

The ability to deal with data stored in files at the logical level is provided by the file system. It is the file system that determines how data is organized on a storage medium.

Thus, a file system is a set of specifications and their associated software that are responsible for creating, destroying, organizing, reading, writing, modifying and moving file information, as well as controlling access to files and managing the resources used by files.

The file management system is the main subsystem in the vast majority of modern operating systems.

With the help of the file management system:

· all system processing programs are linked through their data;

· the problems of centralized allocation of disk space and data management are solved;

· the user is given the ability to perform operations on files (create, etc.), to exchange data between files and various devices, and to protect files from unauthorized access.

Some operating systems may have multiple file management systems, allowing them to work with multiple file systems.

Let's try to distinguish between the file system and the file management system.

The term "file system" defines the principles for accessing data organized in files.

The term "file management system" refers to a particular implementation of a file system, i.e. a set of software modules that provide work with files in a specific OS.

So, to work with files organized in accordance with some file system, an appropriate file management system must be developed for each OS. This file management system will only work on the OS for which it was created.

The file systems mainly used in the Windows OS family are VFAT, FAT 32, and NTFS.

Consider the structure of these file systems.

In the FAT file system, the disk space of any logical drive is divided into two areas:

· the system area and

· the data area.

The system area is created and initialized during formatting, and subsequently updated when the file structure is manipulated.

The system area consists of the following components:

· a boot sector containing the boot record;

· reserved sectors (there may be none);

· file allocation tables (FAT, File Allocation Table);

· the root directory (ROOT).

These components are located on the disk one after another.

The data area contains files and directories subordinate to the root.

The data area is divided into so-called clusters. A cluster is one or more contiguous sectors of the data area. At the same time, a cluster is the smallest addressable unit of disk space allocated to a file. That is, a file or directory occupies a whole number of clusters. To create and write a new file to disk, the operating system allocates several free disk clusters for it. These clusters do not have to follow each other. For each file, a list of the numbers of all clusters allocated to it is stored.

Splitting the data area into clusters instead of using sectors allows you to:

· reduce the size of the FAT table;

· reduce file fragmentation;

· shorten file chains, which speeds up file access.

However, a cluster size that is too large leads to inefficient use of the data area, especially in the case of a large number of small files (after all, an average of half a cluster is lost for each file).

In modern file systems (FAT 32, HPFS, NTFS) this problem is solved by limiting the cluster size (maximum 4 KB).

The map of the data area is the file allocation table (File Allocation Table, FAT). Each element of the FAT table (12, 16 or 32 bits) corresponds to one disk cluster and describes its state: free, busy, or bad.

· If the cluster is allocated to a file (i.e., busy), the corresponding FAT element contains the number of the next cluster of that file;

· the last cluster of the file is marked with a number in the range FF8h - FFFh in FAT12 (FFF8h - FFFFh in FAT16);

· if the cluster is free, its element contains the zero value 000h (0000h);

· a cluster that is unusable (bad) is marked with the number FF7h (FFF7h).

Thus, in the FAT table, clusters belonging to the same file are linked in chains.
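The chaining described above can be sketched in Python as follows. The table is modeled as a plain list of 16-bit values using the FAT16 marks listed above; the function name cluster_chain and the toy table are illustrative assumptions, not an actual driver interface.

```python
END_OF_CHAIN_MIN = 0xFFF8   # FFF8h - FFFFh: last cluster of a file (FAT16)
BAD_CLUSTER = 0xFFF7        # unusable cluster
FREE_CLUSTER = 0x0000       # free cluster

def cluster_chain(fat, first_cluster):
    """Follow the FAT from first_cluster and return the file's cluster list."""
    chain = []
    cluster = first_cluster
    while True:
        chain.append(cluster)
        if len(chain) > len(fat):
            raise ValueError("cycle detected in the FAT chain")
        entry = fat[cluster]
        if entry >= END_OF_CHAIN_MIN:
            return chain                      # end-of-file mark reached
        if entry in (FREE_CLUSTER, BAD_CLUSTER):
            raise ValueError(f"broken chain at cluster {cluster}")
        cluster = entry                       # entry holds the next cluster number

# Toy table with 10 elements: one file occupies clusters 2 -> 5 -> 3.
fat = [FREE_CLUSTER] * 10
fat[2] = 5
fat[5] = 3
fat[3] = 0xFFFF          # last cluster of the file
fat[7] = BAD_CLUSTER     # a failed cluster, never part of a chain

print(cluster_chain(fat, 2))
```

Only the number of the first cluster has to be stored elsewhere; every other cluster number is recovered from the table itself.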

The file allocation table is stored immediately after the boot record of the logical disk, its exact location is described in a special field in the boot sector.

It is stored in two identical copies that follow each other. When the first copy of the table is destroyed, the second is used.

Because FAT is used very heavily during disk access, it is usually loaded into RAM (into the I/O buffers or cache) and kept there for as long as possible.

The main disadvantage of FAT is slow file handling. When creating a file, the rule works - the first free cluster is selected. This leads to disk fragmentation and complex file chains. Hence the slowdown in working with files.
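How the "first free cluster" rule produces fragmentation can be shown with a toy model. The sketch below is a simplification written for illustration (the helper names and the 12-element table are assumptions); it relies only on the rule just described and on the fact that data clusters in FAT are numbered from 2.

```python
FREE = 0x0000
END = 0xFFFF

def allocate_first_free(fat, count):
    """Allocate `count` clusters, always taking the first free ones found."""
    chain = []
    for cluster in range(2, len(fat)):        # entries 0 and 1 are reserved
        if fat[cluster] == FREE:
            chain.append(cluster)
            if len(chain) == count:
                break
    if len(chain) < count:
        raise OSError("not enough free clusters")
    for current, following in zip(chain, chain[1:]):
        fat[current] = following              # link each cluster to the next
    fat[chain[-1]] = END                      # mark the last cluster
    return chain

def free_chain(fat, chain):
    """Delete a file: mark all of its clusters as free."""
    for cluster in chain:
        fat[cluster] = FREE

fat = [END, END] + [FREE] * 10                # placeholders in the reserved entries
file_a = allocate_first_free(fat, 2)          # takes clusters 2, 3
file_b = allocate_first_free(fat, 2)          # takes clusters 4, 5
free_chain(fat, file_a)                       # deleting A leaves a hole before B
file_c = allocate_first_free(fat, 4)          # fills the hole, then jumps past B

print(file_c)
```

File C ends up split into two pieces around file B, so reading it requires an extra head movement; repeated many times, this is the slowdown described above.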

To view and edit the FAT table, you can use the Disk Editor utility.

Detailed information about the file itself is stored in another structure called the root directory. Each logical disk has its own root directory (ROOT).

The root directory describes files and other directories. Each directory entry is a file descriptor.

The descriptor of each file and directory includes:

· the name;

· the extension;

· the date of creation or last modification;

· the time of creation or last modification;

· attributes (archive, directory, volume label, system, hidden, read-only);

· the file length (for a directory, 0);

· a reserved field that is not used;

· the number of the first cluster in the chain of clusters assigned to the file or directory; having received this number, the operating system consults the FAT table to find all the other cluster numbers of the file.

So, the user launches the file for execution. The operating system looks for a file with the desired name by looking at file descriptions in the current directory. When the required element is found in the current directory, the operating system reads the number of the first cluster of this file, and then determines the remaining cluster numbers from the FAT table. Data from these clusters is read into RAM, combined into one continuous section. The operating system transfers control to the file, and the program starts running.
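The descriptor fields listed above can be unpacked from raw bytes as in the following sketch. The exact field widths are not given in this text; the 32-byte layout used here follows the commonly documented classic FAT directory entry and should be read as an assumption, as should the helper name parse_entry and the sample entry.

```python
import struct

# Assumed classic 32-byte entry, little-endian:
# name(8) ext(3) attributes(1) reserved(10) time(2) date(2) first cluster(2) size(4)
ENTRY_FORMAT = "<8s3sB10sHHHI"

ATTR_READ_ONLY = 0x01
ATTR_DIRECTORY = 0x10

def parse_entry(raw):
    """Decode one directory entry into a dictionary of descriptor fields."""
    name, ext, attrs, _reserved, time, date, first_cluster, size = struct.unpack(
        ENTRY_FORMAT, raw
    )
    return {
        "name": name.decode("ascii").rstrip(),   # names are padded with spaces
        "ext": ext.decode("ascii").rstrip(),
        "read_only": bool(attrs & ATTR_READ_ONLY),
        "is_directory": bool(attrs & ATTR_DIRECTORY),
        "time": time,
        "date": date,
        "first_cluster": first_cluster,          # starting point of the FAT chain
        "size": size,
    }

# A made-up entry for README.TXT: 10240 bytes, starting at cluster 2.
raw = struct.pack(ENTRY_FORMAT, b"README  ", b"TXT", 0x20, b"\x00" * 10, 0, 0, 2, 10240)
entry = parse_entry(raw)
print(entry["name"], entry["ext"], entry["first_cluster"], entry["size"])
```

The first_cluster value is what the operating system hands to the FAT table to obtain the remaining clusters of the file.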

To view and edit the ROOT directory, you can also use the Disk Editor utility.

File system VFAT

The VFAT (virtual FAT) file system first appeared in Windows for Workgroups 3.11 and was designed for file I/O in protected mode.

This file system is used in Windows 95.

It is also supported in Windows NT 4.

VFAT is Windows 95's "native" 32-bit file system. It is controlled by the VFAT.VXD driver.

VFAT uses 32-bit code for all file operations and can use 32-bit protected mode drivers.

However, the file allocation table entries remain 12- or 16-bit, so the same data structure (FAT) is used on the disk. That is, the VFAT table format is the same as the FAT format.

Along with "8.3" names, VFAT supports long filenames. (VFAT is often described as FAT with support for long names.)

The main disadvantage of VFAT is the large amount of space lost to partly filled clusters on large logical disks, along with restrictions on the size of the logical disk itself.

File system FAT 32

This is a new implementation of the idea of using the FAT table.

FAT 32 is a completely independent 32-bit file system.

It was first used in Windows 95 OSR 2 (OEM Service Release 2).

FAT 32 is currently used in Windows 98 and Windows ME.

It contains numerous improvements and additions over previous FAT implementations.

1. Much more efficient use of disk space, because it uses smaller clusters (4 KB); it is estimated to save up to 15%.

2. Has an extended boot record that allows copies of critical data structures to be created, which increases the disk's resistance to damage of its structures.

3. Can use the backup copy of the FAT instead of the standard one.

4. Can move the root directory; in other words, the root directory can be in an arbitrary location, which removes the limit on the size of the root directory (512 entries, since ROOT was supposed to occupy one cluster).

5. Improved root directory structure.

Additional fields appeared, for example, creation time, creation date, last access date, and checksum.

There are still multiple descriptors for a long filename.

File system HPFS

HPFS (High Performance File System) is a high performance file system.

HPFS first appeared in OS/2 1.2 and LAN Manager.

Let's list the main features of HPFS.

· The main difference lies in the basic principles of placing files on a disk and of storing information about the location of files. Thanks to these principles, HPFS offers high performance and fault tolerance and is a reliable file system.

· Disk space in HPFS is allocated not in clusters (as in FAT) but in blocks. In the current implementation, the block size is equal to one sector, but in principle it could be different. (In effect, a block is a cluster that is always equal to one sector.) Placing files in such small blocks allows disk space to be used more efficiently, since the overhead of unused space averages only half a sector (256 bytes) per file. Recall that the larger the cluster size, the more disk space is wasted.

· The HPFS system tries to place a file in contiguous blocks or, if this is not possible, to place it on disk so that the extents (fragments) of the file are physically as close to each other as possible. This approach significantly reduces the positioning time of the hard disk's read/write heads and the latency (the delay after the read/write head has been positioned on the correct track). Recall that in FAT, a file is simply given the first free cluster.

Extents are file fragments located in adjacent disk sectors. A file has one extent if it is not fragmented, and more than one extent otherwise.

· The method of balanced binary trees is used for storing and searching for information about the location of files (directories are stored in the center of the disk, and automatic sorting of directories is provided), which significantly improves the performance of HPFS (versus FAT).

· HPFS provides special extended file attributes that make it possible to manage access to files and directories.

Extended attributes (EAs) allow you to store additional information about the file. For example, each file can be associated with its own graphic image (icon), a file description, a comment, information about the file owner, etc.

HPFS Partition Structure

At the beginning of a partition with HPFS installed, there are three control blocks:

· the boot block,

· the super block, and

· the spare block.

They occupy 18 sectors.

All other disk space in HPFS is divided into groups of adjacent sectors called bands (also translated as stripes). Each band occupies 8 MB on disk.

Each band has its own sector allocation bitmap. The bitmap shows which sectors of a given band are occupied and which are free. Each sector of the band corresponds to one bit in its bitmap. If the bit is 1, the sector is busy; if 0, it is free.
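A band bitmap of this kind can be modeled in a few lines of Python. The sketch follows the convention stated above (1 means busy, 0 means free) and the 512-byte sector implied by the text; the order of bits within a byte and all function names are assumptions made for illustration.

```python
SECTOR_SIZE = 512                              # bytes per sector
BAND_SIZE = 8 * 1024 * 1024                    # 8 MB per band
SECTORS_PER_BAND = BAND_SIZE // SECTOR_SIZE    # sectors covered by one bitmap
BITMAP_BYTES = SECTORS_PER_BAND // 8           # one bit per sector

def is_busy(bitmap, sector):
    """True if the sector's bit is 1 (assumed: lowest bit of a byte comes first)."""
    return bool((bitmap[sector // 8] >> (sector % 8)) & 1)

def mark_busy(bitmap, sector):
    bitmap[sector // 8] |= 1 << (sector % 8)

def find_free_run(bitmap, length):
    """Return the first sector of the first run of `length` free sectors, or None."""
    run_start, run_length = 0, 0
    for sector in range(SECTORS_PER_BAND):
        if is_busy(bitmap, sector):
            run_length = 0
            continue
        if run_length == 0:
            run_start = sector
        run_length += 1
        if run_length == length:
            return run_start
    return None

bitmap = bytearray(BITMAP_BYTES)               # all zeros: every sector is free
for sector in (0, 1, 2, 3, 4, 7):
    mark_busy(bitmap, sector)

print(SECTORS_PER_BAND, BITMAP_BYTES)
print(find_free_run(bitmap, 2), find_free_run(bitmap, 3))
```

Because the bitmap for a band is only a couple of kilobytes, a search for a contiguous free run stays local to that band.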

The bitmaps of two adjacent bands are located side by side on the disk, as are the bands themselves. That is, the sequence of bands and bitmaps looks as shown in the figure.

Compare this with FAT. There, a single "bitmap" (the FAT table) serves the entire disk, and working with it requires moving the read/write heads across half the disk on average.

It is precisely to reduce the positioning time of the hard disk's read/write heads that the HPFS disk is divided into bands.

Consider control blocks.

Boot block

Contains the name of the volume, its serial number, the BIOS parameter block, and the boot program.

The boot program finds the file OS2LDR, reads it into memory, and transfers control to this OS loader, which in turn loads the OS/2 kernel, OS2KRNL, from disk into memory. OS2KRNL then uses information from the CONFIG.SYS file to load all other necessary program modules and data blocks into memory.

The boot block is located in sectors 0 to 15.

Super block

Contains

· a pointer to the list of bitmaps (bitmap block list). This list enumerates all the blocks on the disk that contain the bitmaps used to find free sectors;

· a pointer to the list of bad blocks (bad block list). When the system detects a damaged block, it is added to this list and is no longer used for storing information;

· a pointer to the directory band;

· a pointer to the file node (F-node) of the root directory;

· the date of the last check of the partition by the CHKDSK program;

· information about the band size (in the current implementation of HPFS, 8 MB).

The super block is located in sector 16.

Spare block

Contains

· a pointer to the emergency replacement map (hotfix map or hotfix areas);

· a pointer to the list of free spare blocks (directory emergency free block list);

· a number of system flags and descriptors.

This block is located in sector 17 of the disk.

The spare block provides high fault tolerance of the HPFS file system and allows you to recover damaged data on the disk.

The principle of file placement


To reduce the positioning time of the read/write heads of the hard disk, the HPFS system tries to

1) place the file in adjacent blocks;

2) if this is not possible, place the extents of the fragmented file as close to each other as possible.

To do this, HPFS uses statistics, and also tries to conditionally reserve at least 4 kilobytes of space at the end of files that grow.

Principles of storing information about the location of files

Each file and directory on the disk has its own file node, or F-node. This is a structure that contains information about the location of the file and its extended attributes.

Each F-node occupies one sector and is always located near its file or directory (usually just before it). The F-node contains

· the length;

· the first 15 characters of the file name;

· special service information;

· file access statistics;

· extended file attributes;

· a list of access rights (or only part of this list, if it is very large); if the extended attributes are too large for the file node, a pointer to them is written to it;

· associative information about the location and subordination of the file, etc.

If the file is continuous, then its location on the disk is described by two 32-bit numbers. The first number is a pointer to the first block of the file, and the second is the extent length (the number of consecutive blocks that belong to the file).

If the file is fragmented, then the location of its extents is described in the file node with additional pairs of 32-bit numbers.

A file node can contain information about up to eight extents of a file. If a file has more extents, then a pointer to an allocation block is written to its file node, which can contain up to 40 pointers to extents or, by analogy with a directory tree block, to other allocation blocks.
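The description of a file by pairs of numbers (first block, extent length) can be sketched as follows. This is an illustrative model only: the function name block_to_sector and the sample sector numbers are assumptions, and the allocation blocks used for files with more than eight extents are not modeled.

```python
MAX_EXTENTS_IN_FNODE = 8   # a file node describes up to eight extents directly

def block_to_sector(extents, file_block):
    """Translate a block number inside the file into a disk sector number.

    `extents` is a list of (first_sector, length) pairs in file order.
    """
    for first_sector, length in extents:
        if file_block < length:
            return first_sector + file_block
        file_block -= length            # skip this extent and keep looking
    raise IndexError("block lies beyond the end of the file")

# A contiguous 12-block file needs a single pair of numbers.
contiguous = [(1000, 12)]

# The same 12 blocks split into three extents.
fragmented = [(1000, 4), (5000, 6), (2000, 2)]

print(block_to_sector(contiguous, 5))
print(block_to_sector(fragmented, 5))
print(block_to_sector(fragmented, 11))
```

The fewer extents a file has, the shorter this list is and the less the heads have to move, which is why HPFS tries to keep files contiguous.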

Structure and placement of directories

A band in the center of the disk is used to store directories.

This band is called the directory band.

If it is full, HPFS starts placing file directories in other bands.

Placing this information structure in the middle of the disk significantly reduces the average positioning time of the read/write heads.

However, a significantly greater contribution to HPFS performance (compared with placing the directory band in the middle of the logical disk) comes from using the method of balanced binary trees for storing and retrieving information about the location of files.

Recall that in the file system FAT the directory has a linear structure, not specially ordered, so when searching for a file, you need to sequentially look through it from the very beginning.

In HPFS, the directory structure is a balanced tree with entries in alphabetical order.

Each entry in the tree contains

· file attributes,

· a pointer to the corresponding file node,

· the time and date of the file's creation, last update, and last access,

· the length of the data containing extended attributes,

· the file access counter,

· the length of the file name,

· the name itself,

· and other information.

When searching for a file in a directory, the HPFS file system only looks at the necessary branches of the binary tree. This method is many times more efficient than sequential reading of all entries in the directory, which is the case in the FAT system.
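The difference between the two search methods can be illustrated by counting comparisons. In the sketch below a sorted list with binary search stands in for the balanced tree (it is not the HPFS on-disk structure), and the file names are generated purely for the example.

```python
def linear_lookup(names, target):
    """FAT-style search: scan the entries from the start, counting comparisons."""
    for steps, name in enumerate(names, start=1):
        if name == target:
            return steps
    return None

def binary_lookup(sorted_names, target):
    """Search in alphabetically ordered entries, counting comparisons."""
    low, high, steps = 0, len(sorted_names), 0
    while low < high:
        middle = (low + high) // 2
        steps += 1
        if sorted_names[middle] == target:
            return steps
        if sorted_names[middle] < target:
            low = middle + 1            # target is in the upper half
        else:
            high = middle               # target is in the lower half
    return None

names = [f"FILE{i:04d}.TXT" for i in range(1000)]   # already in alphabetical order
target = "FILE0999.TXT"

print(linear_lookup(names, target), binary_lookup(names, target))
```

For a directory of 1000 entries the sequential scan may need up to 1000 comparisons, while the ordered search needs at most 10; the gap widens as the directory grows.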

The size of each of the blocks in terms of which directories are allocated in the current implementation of HPFS is 2 KB. The size of a record describing a file depends on the size of the file name. If the name is 13 bytes (for 8.3 format), then a 2K block can hold up to 40 file descriptors. The blocks are linked to each other by means of a list.

Problems

When files are renamed, a so-called rebalancing of the tree can occur. Creating, renaming, or erasing a file may result in cascading splits of directory blocks. This operation may require the allocation of additional blocks on a full disk, so a rename can actually fail for lack of disk space even though the file itself has not grown. To avoid this, HPFS maintains a small pool of free blocks that can be used in such an emergency. A pointer to this pool of free blocks is stored in the spare block.

How files and directories are placed on an HPFS disk:

· information about the location of files is dispersed throughout the disk, with the records for each specific file placed (if possible) in adjacent sectors and close to the data on their location;

· directories are placed in the middle of the disk space;

· directories are stored as a balanced binary tree with entries arranged in alphabetical order.

Reliability of data storage in HPFS

Any file system must have the means to correct errors that occur when information is written to disk. The HPFS system uses an emergency replacement mechanism (hotfix) for this purpose.

If the HPFS file system encounters a problem while writing data to disk, it displays an appropriate error message. HPFS then stores the information that should have been written to the bad sector in one of the spare sectors reserved in advance for this case. The list of free spare blocks is stored in the HPFS spare block. If an error is detected while writing data to a normal block, HPFS selects one of the free spare blocks and stores the data in it. The file system then updates the emergency replacement map (hotfix map) in the spare block.

This map is simply a list of pairs of double words, each of which is a 32-bit sector number.

The first number indicates the defective sector, and the second indicates the spare sector that was chosen to replace it.

After the bad sector is replaced with a spare, the hotfix map is written to disk, and a pop-up window appears on the screen informing the user that a disk write error has occurred. Every time the system writes or reads a disk sector, it consults the hotfix map and replaces all bad sector numbers with the numbers of the spare sectors that hold the corresponding data.

It should be noted that this number translation does not significantly affect the performance of the system, since it is performed only when physically accessing the disk, but not when reading data from the disk cache.
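The emergency replacement mechanism can be modeled roughly as follows. The class HotfixMap, its method names, and the sector numbers are hypothetical; the sketch only mirrors the pairs (defective sector, spare sector) described above.

```python
class HotfixMap:
    """Toy model of the emergency replacement map kept in the spare block."""

    def __init__(self, spare_sectors):
        self.free_spares = list(spare_sectors)   # spare sectors reserved in advance
        self.replacements = {}                   # defective sector -> spare sector

    def record_bad_sector(self, bad_sector):
        """Pick a free spare sector for a sector that failed during a write."""
        if not self.free_spares:
            raise OSError("no spare sectors left")
        spare = self.free_spares.pop(0)
        self.replacements[bad_sector] = spare
        return spare

    def translate(self, sector):
        """Return the sector that really holds the data for `sector`."""
        return self.replacements.get(sector, sector)


hotfix = HotfixMap(spare_sectors=[9000, 9001])
replacement = hotfix.record_bad_sector(1234)     # a write to sector 1234 failed

print(replacement, hotfix.translate(1234), hotfix.translate(1235))
```

The translation is a single table lookup per physical access, which is consistent with the remark that it has little effect on performance.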

File system NTFS

The NTFS (New Technology File System) file system contains a number of significant improvements and changes that significantly distinguish it from other file systems.

Note that, with rare exceptions, NTFS partitions can only be accessed directly from Windows NT, although file management systems that can read files from NTFS volumes have been implemented for a number of other operating systems.

However, there are no full-fledged implementations for working with NTFS outside of Windows NT yet.

NTFS is not supported on the widely used Windows 98 and Windows Millennium Edition operating systems.

Key Features of NTFS

· work on large disks is efficient (much more efficient than in FAT);

· there are tools to restrict access to files and directories, so NTFS partitions provide local security for both files and directories;

· a transaction mechanism has been introduced, in which file operations are logged, giving a significant increase in reliability;

· many restrictions on the maximum number of disk sectors and/or clusters have been removed;

· a file name in NTFS, unlike in the FAT and HPFS file systems, can contain any characters, including the full set of national alphabets, since the data is represented in Unicode, a 16-bit representation that gives 65,536 different characters. The maximum length of a filename in NTFS is 255 characters;

· NTFS also has built-in compression tools that you can apply to individual files, entire directories, and even volumes (and subsequently cancel or reassign at your discretion).

Volume structure with NTFS file system

An NTFS partition is called a volume. The maximum possible volume size (and file size) is 16 exabytes (2**64 bytes).

Like other systems, NTFS divides a volume's disk space into clusters, blocks of data that are addressed as units of data. NTFS supports cluster sizes from 512 bytes to 64 KB; the standard is a cluster of 2 or 4 KB.

All disk space in NTFS is divided into two unequal parts.

The first 12% of the disk is reserved for the so-called MFT zone - the space that can be occupied, increasing in size, by the main service metafile MFT.

It is not possible to write any data to this area. The MFT zone is always kept empty - this is done so that the MFT file, if possible, does not become fragmented as it grows.

The remaining 88% of the volume is ordinary file storage space.

The MFT (Master File Table) is essentially a directory of all other files on the disk, including itself. It is used to determine the location of files.

The MFT consists of fixed size records. The size of the MFT entry (minimum 1 KB and maximum 4 KB) is determined during volume formatting.

Each entry corresponds to a file.

The first 16 entries are service in nature and are not available to the operating system - they are called metafiles, and the very first metafile is the MFT itself.

These first 16 MFT elements are the only part of the disk that has a strictly fixed position. A copy of the same 16 records is kept in the middle of the volume for security.

The remaining parts of the MFT file can be located, like any other file, in arbitrary places on the disk.

Metafiles are service in nature: each of them is responsible for some aspect of the system. Metafiles are located in the root directory of an NTFS volume. Their names all begin with the character "$", although it is difficult to get any information about them using standard tools. The table lists the main metafiles and their purpose.

Metafile name	Purpose of the metafile
$MFT	The Master File Table itself
$MFTMirr	A copy of the first 16 MFT records placed in the middle of the volume
$LogFile	Logging support file
$Volume	Service information: volume label, file system version, etc.
$AttrDef	List of standard file attributes on a volume
$.	Root directory
$Bitmap	Volume free space map
$Boot	Boot sector (if the partition is bootable)
$Quota	A file that records user rights to use disk space (this file only started working in Windows 2000 with NTFS 5.0)
$UpCase	A table of correspondence between upper and lower case letters in file names. In NTFS, filenames are written in Unicode (which is 65 thousand different characters), and finding upper and lower case equivalents in this case is a non-trivial task

The corresponding MFT record stores all information about the file:

· file name;

· size;

· file attributes;

· position on the disk of individual fragments, etc.

If one MFT record is not enough to hold this information, several records are used, and not necessarily consecutive ones.

If the file is not very large, then the file data is stored directly in the MFT, in the space remaining from the main data, within one MFT record.

A file on an NTFS volume is identified by a so-called file reference, which is represented as a 64-bit number. It consists of

· the file number, which corresponds to the record number in the MFT,

· and the sequence number. This number is incremented whenever the given MFT record number is reused, allowing the NTFS file system to perform internal integrity checks.
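Packing the two parts into one 64-bit number can be sketched as below. The text does not say how the 64 bits are divided; the split used here (48 bits for the file number, 16 bits for the sequence number) follows commonly published descriptions of NTFS and should be treated as an assumption, as should the function names.

```python
FILE_NUMBER_BITS = 48                      # assumed width of the MFT record number
SEQUENCE_BITS = 16                         # assumed width of the sequence number
FILE_NUMBER_MASK = (1 << FILE_NUMBER_BITS) - 1

def make_file_reference(file_number, sequence_number):
    """Combine an MFT record number and a sequence number into one 64-bit value."""
    if file_number > FILE_NUMBER_MASK or sequence_number >= (1 << SEQUENCE_BITS):
        raise ValueError("value does not fit into the file reference")
    return (sequence_number << FILE_NUMBER_BITS) | file_number

def split_file_reference(reference):
    """Return (file_number, sequence_number) from a 64-bit file reference."""
    return reference & FILE_NUMBER_MASK, reference >> FILE_NUMBER_BITS

reference = make_file_reference(file_number=42, sequence_number=3)
print(hex(reference), split_file_reference(reference))

# If MFT record 42 is later reused for another file, its sequence number grows,
# so an old reference no longer matches the new one.
stale = reference
current = make_file_reference(file_number=42, sequence_number=4)
print(stale == current)
```

A mismatch between the stored sequence number and the one in the MFT record is what lets the file system detect a reference to a file that no longer exists.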

Each file in NTFS is represented by streams; that is, it does not have "just data" as such, but consists of streams.

One of the streams is the file's data.

Most file attributes are also streams.

Thus, it turns out that the file has only one basic entity - the number in the MFT, and everything else, including its streams, is optional.

This approach can be put to practical use: for example, another stream can be attached to a file and any data written to it.

The standard attributes for files and directories on an NTFS volume have fixed names and type codes.

An NTFS directory is a special file that stores links to other files and directories.

The directory file is divided into blocks, each of which contains:

· the file name;

· basic attributes;

· a reference to the MFT record that holds the complete information about the entry.

The root directory of a disk is no different from ordinary directories, except for a special link to it from the beginning of the MFT metafile.

The internal directory structure is a B-tree, as in HPFS.

The number of files in the root and non-root directories is unlimited.

The NTFS file system supports the NT security object model: NTFS treats directories and files as heterogeneous objects and maintains separate (though overlapping) lists of permissions for each type.

NTFS provides file-level security; this means that access rights to volumes, directories, and files may depend on the user account and the groups to which the user belongs. Each time a user accesses a file system object, their permissions are checked against the object's permission list. If the user has a sufficient level of rights, his request is granted; otherwise, the request is rejected. This security model applies to both local user login on NT machines and remote network requests.

NTFS also has some self-healing features. NTFS supports various mechanisms for checking system integrity, including transaction logging, which allows you to replay file write operations against a special system log.

When file operations are journaled, the file management system records the changes being made in a special service file. At the start of an operation that changes the file structure, a corresponding mark is made. If a failure occurs during the operation, that start mark remains flagged as incomplete. When the file system integrity check runs after the machine is rebooted, such pending operations are undone and the files are restored to their original state. If the operation completes normally, it is marked as completed in the same logging service file.
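The idea of the start and completion marks can be shown with a toy model. This is only a sketch of the general principle, not the NTFS log format: the `Journal` class, its methods and the dictionary standing in for the disk are all hypothetical.

```python
class Journal:
    """Toy stand-in for the logging service file."""

    def __init__(self):
        self.records = []

    def begin(self, op_id, path, old_value):
        # Start mark: remember what the file looked like before the change.
        self.records.append(
            {"op": op_id, "path": path, "old": old_value, "done": False}
        )

    def commit(self, op_id):
        # Completion mark for a successfully finished operation.
        for record in self.records:
            if record["op"] == op_id:
                record["done"] = True


def recover(journal, files):
    """Undo every operation that was started but never marked as completed."""
    for record in reversed(journal.records):
        if not record["done"]:
            files[record["path"]] = record["old"]
    journal.records = [r for r in journal.records if r["done"]]
```

If a crash happens between `begin` and `commit`, calling `recover` after the reboot returns the affected file to the state saved in the start mark.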

The main disadvantage of the NTFS file system is that service data takes up a lot of space (for example, each directory element takes up 2 KB); on small partitions, service data can occupy up to 25% of the media volume.

Consequently, NTFS cannot be used to format floppy disks, and it should not be used to format partitions smaller than 100 MB.

UNIX file systems

In the UNIX world, there are several different kinds of file systems with their own external memory structure. The best known are the traditional UNIX System V (s5) file system and the UNIX BSD family (ufs) file system.

Consider s5.

A UNIX file is a randomly accessible sequence of characters (bytes).

The file has a structure that the user imposes on it.

The Unix file system is a hierarchical, multi-user file system.

The file system has a tree structure. The vertices (intermediate nodes) of the tree are directories with links to other directories or files. The leaves of the tree correspond to files or empty directories.

Comment. Strictly speaking, the UNIX file system is not a tree: the hierarchy can be broken, because several names can be associated with the same file contents.

Disk structure

The disk is divided into blocks. The data block size is determined when the file system is formatted with the mkfs command and can be set to 512, 1024, 2048, 4096, or 8192 bytes.

In what follows we assume a block size of 512 bytes (the sector size).

The disk space is divided into the following areas (see figure):

· boot block;

· superblock;

· array of i-nodes;

· area for storing the contents (data) of files;

· set of free blocks (linked in a list).

[Figure: disk layout, in order: boot block, superblock, i-node, ..., i-node]

Comment. In the UFS file system all of this, except the boot block, is repeated for each cylinder group, and a special area is allocated to describe the cylinder group.

Boot block

The boot block is located in block 0. (Recall that the placement of this block in block zero of the system device is dictated by the hardware, since the hardware loader always reads block zero of the system device. This is the last component of the file system that depends on the hardware.)

The boot block contains a bootstrap program that is used for the initial startup of the UNIX OS. In s5 file systems, only the boot block of the root file system is actually used. In secondary file systems, this area is present but not used.

Superblock

It contains operational information about the state of the file system, as well as data about file system settings.

Specifically, the superblock contains the following information:

· the number of i-nodes (index descriptors);

· the size of the partition;

· the list of free blocks;

· the list of free i-nodes;

· and other data.

Note that free disk space is organized as a linked list of free blocks, and this list is stored in the superblock.

The elements of the list are arrays of 50 elements (with a 512-byte block, each element is 16 bits):

· array elements 1-48 contain the numbers of free blocks of the file block space, from 2 to 49;

· element 0 contains a pointer to the continuation of the list;

· the last element (49) contains a pointer to the free element in the array.

If a process needs a free block to extend a file, the system selects the array element indicated by the pointer (to the free element), and the block whose number is stored in that element is given to the file. If a file shrinks, the released block numbers are added to the array of free blocks and the pointer to the free element is adjusted.

Since the array holds 50 elements, two critical situations are possible:

1. Blocks are being released but no longer fit in the array. In this case, one free block is selected from the file system and the completely filled array of free blocks is copied into it. The pointer to the free element is then reset to zero, and the number of the block chosen to hold the copy is written to element 0 of the array in the superblock. At this point a new element of the free block list (again with 50 entries) has been created.

2. The free block numbers in the array are exhausted. If element 0 of the array is zero, the list has no continuation. If it is not zero, a continuation exists, and it is read into the copy of the superblock held in RAM.
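The two critical situations can be traced in a small simulation. It is a simplified sketch, not the s5 on-disk format: it assumes that the block being released is the one chosen to hold the spilled array, that block number 0 means "no continuation", and it keeps 48 block numbers in the array; the class name and the `disk` dictionary are hypothetical.

```python
CAPACITY = 48  # free block numbers kept directly in the superblock array


class FreeList:
    def __init__(self):
        self.next_block = 0  # element 0: block holding the continuation, 0 = none
        self.slots = []      # numbers of free blocks held in the superblock
        self.disk = {}       # block number -> (next_block, slots) stored in it

    def free(self, block):
        if len(self.slots) == CAPACITY:
            # Situation 1: the array is full. Copy it into the released block
            # and make element 0 of the superblock array point at that block.
            self.disk[block] = (self.next_block, self.slots)
            self.next_block, self.slots = block, []
        else:
            self.slots.append(block)

    def alloc(self):
        if self.slots:
            return self.slots.pop()
        if self.next_block == 0:
            raise OSError("no free blocks")
        # Situation 2: the array is exhausted but a continuation exists.
        # Read it into the in-memory superblock; the block that held it
        # is itself free now and can be handed out.
        block = self.next_block
        self.next_block, self.slots = self.disk.pop(block)
        return block
```

Releasing 100 blocks and then allocating 100 times returns every released block exactly once, even though the superblock never holds more than 48 numbers at a time.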

List of free i-nodes. This is a buffer of 100 elements. It holds the numbers of 100 i-nodes that are free at the moment.

The superblock is always held in RAM, so all operations (releasing and occupying blocks and i-nodes) take place in RAM, which minimizes disk exchanges.

But! If the contents of the superblock are not written to disk and the power is turned off, then problems will arise (a discrepancy between the real state of the file system and the contents of the superblock). But this is already a requirement for the reliability of the system equipment.

Comment. UFS file systems support multiple copies of the superblock (one copy per cylinder group) for increased resiliency

Inode area

This is an array of file descriptions called i-nodes (64 bytes each).

Each index descriptor (i-node) of a file contains:

File type (file/directory/special file/fifo/socket)

Attributes (permissions) - 10

File owner ID

The ID of the group that owns the file

File creation time

File modification time

The last time the file was accessed

File length

The number of links to the given i-node from different directories

Addresses of file blocks

Note: the file name is not stored here.

Let's take a closer look at how the addressing of the blocks that make up the file is organized. The address field holds the numbers of the first 10 blocks of the file.

If the file exceeds ten blocks, then the following mechanism starts to work: the 11th element of the field contains the block number, which contains 128 (256) links to the blocks of the given file. In the event that the file is even larger, then the 12th element of the field is used - it contains the block number, which contains 128 (256) block numbers, where each block contains 128 (256) block numbers of the file system. And if the file is even larger, then the 13th element is used - where the nesting depth of the list is increased by one more.

Thus, we can get a file of size (10 + 128 + 128^2 + 128^3) * 512 bytes.

This can be represented in the following form:

Address of the 1st file block

Address of the 2nd file block

Address of the 10th file block

Address of the indirect block (a block with 128 (256) block addresses)

Address of the 2nd-level indirect block (a block with 128 (256) addresses of blocks of addresses)

Address of the 3rd-level indirect block (a block of addresses of blocks of addresses of blocks of addresses)
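The addressing scheme and the size formula can be checked with a few lines of code. The sketch assumes 512-byte blocks with 128 block numbers per indirect block, as in the formula above; the function names are hypothetical.

```python
BLOCK_SIZE = 512
PTRS_PER_BLOCK = 128  # assumes 4-byte block numbers in a 512-byte block
DIRECT = 10           # block numbers stored directly in the i-node


def max_file_size():
    """Largest file, in bytes, reachable through all 13 address elements."""
    n = PTRS_PER_BLOCK
    return (DIRECT + n + n**2 + n**3) * BLOCK_SIZE


def addressing_level(logical_block):
    """Return 0 for a direct block, 1-3 for single/double/triple indirection."""
    if logical_block < 0:
        raise ValueError("block index must be non-negative")
    if logical_block < DIRECT:
        return 0
    logical_block -= DIRECT
    for level in (1, 2, 3):
        span = PTRS_PER_BLOCK ** level  # blocks reachable at this level
        if logical_block < span:
            return level
        logical_block -= span
    raise ValueError("block lies beyond the maximum file size")
```

With these parameters `max_file_size()` gives 1,082,201,088 bytes, that is, slightly more than 1 GB.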

File protection

Now let's look at the owner and group IDs and security bits.

The UNIX OS uses a three-level hierarchy of users:

The first level is all users.

The second level is user groups. (All users are divided into groups.)

The third level is a specific user. (Groups consist of real users.) Because of this three-level organization of users, each file has the following attributes:

1) The owner of the file. This attribute is associated with one particular user, whom the system automatically records as the owner of the file. By default the owner is the user who created the file; there is also a command that changes the owner of a file.

2) File access protection. Access to each file is restricted for three categories:

· the rights of the owner (what the owner can do with the file; in general, not necessarily everything);

· the rights of the group to which the owner of the file belongs. The owner is not included here (for example, a file can be read-protected for the owner while all other members of the group can read it freely);

· the rights of all other users of the system.

According to these three categories, three actions are regulated: reading from a file, writing to a file, and executing a file (in the system mnemonics R, W, X, respectively). In each file, these three categories define which user can read, which write, and who can run it as a process.
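The rule that exactly one category applies to a given user can be sketched as follows. The mode bits come from Python's standard `stat` module; the function name and its parameters are hypothetical, and the special treatment of the superuser is deliberately left out.

```python
import stat

# Permission bits for each action, in the order (owner, group, others).
BITS = {
    "r": (stat.S_IRUSR, stat.S_IRGRP, stat.S_IROTH),
    "w": (stat.S_IWUSR, stat.S_IWGRP, stat.S_IWOTH),
    "x": (stat.S_IXUSR, stat.S_IXGRP, stat.S_IXOTH),
}


def may_access(mode, file_uid, file_gid, uid, gids, want):
    """Check whether user `uid` (member of groups `gids`) may perform `want`.

    `want` is one of "r", "w", "x". Only one category is consulted:
    the owner's bits for the owner, otherwise the group's bits for a
    group member, otherwise the bits for all other users.
    """
    owner_bit, group_bit, other_bit = BITS[want]
    if uid == file_uid:
        return bool(mode & owner_bit)  # group and other bits are ignored
    if file_gid in gids:
        return bool(mode & group_bit)
    return bool(mode & other_bit)
```

With mode 0o040 (read permission for the group only) the owner cannot read the file, while any other member of the group can, which is the situation mentioned in the list above.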

Directory organization

The directory from the point of view of the OS is a regular file that contains data about all the files that belong to the directory.

The directory element consists of two fields:

1) the number of the i-node (serial number in the array of i-nodes) and

2) the file name.

Each directory contains two special names: '.' - the directory itself; ‘..’ is the parent directory.

(For the root directory, the parent refers to itself.)

In general, a directory can contain several entries that refer to the same i-node, but it cannot contain two entries with the same name. That is, an arbitrary number of names can be associated with the contents of a file. This is called linking. A directory entry that refers to a file is called a link.

Files exist independently of directory entries, and directory links actually point to physical files. A file "disappears" when the last link pointing to it is removed.

So, to access a file by name, the operating system:

1. finds this name in the directory containing the file,

2. gets the i-node number of the file,

3. by number finds i-node in the area of i-nodes,

4. from the i-node receives the addresses of the blocks in which the file data is located,

5. reads blocks from the data area by block addresses.
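The five steps above can be sketched with an in-memory model. Everything in it is illustrative: the i-node numbers, block numbers, the choice of 2 for the root i-node and the dictionary layout are assumptions made for the example, not the real on-disk format.

```python
ROOT_INODE = 2  # hypothetical number chosen for the root directory

# i-node area: i-node number -> description (note: no file names here)
inodes = {
    2: {"type": "dir", "blocks": [20]},
    3: {"type": "dir", "blocks": [21]},
    4: {"type": "file", "blocks": [30, 31]},
}

# data area: block number -> contents; directory blocks map names to i-nodes
data = {
    20: {".": 2, "..": 2, "home": 3},
    21: {".": 3, "..": 2, "notes.txt": 4},
    30: b"hello, ",
    31: b"world",
}


def lookup(path):
    """Steps 1-3: walk the directories and return the file's i-node number."""
    inode_no = ROOT_INODE
    for name in filter(None, path.split("/")):
        inode = inodes[inode_no]
        if inode["type"] != "dir":
            raise NotADirectoryError(name)
        entries = {}
        for block in inode["blocks"]:
            entries.update(data[block])
        if name not in entries:
            raise FileNotFoundError(path)
        inode_no = entries[name]
    return inode_no


def read_file(path):
    """Steps 4-5: take block addresses from the i-node and read the blocks."""
    inode = inodes[lookup(path)]
    return b"".join(data[block] for block in inode["blocks"])
```

Here `read_file("/home/notes.txt")` first resolves the name to i-node 4 and only then touches data blocks 30 and 31.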

Disk partition structure in EXT2 FS

The entire partition space is divided into blocks. A block can be 1, 2, or 4 kilobytes in size. A block is an addressable unit of disk space.

Blocks are in turn combined into block groups. Block groups in a file system and blocks within a group are numbered sequentially starting from 1. The first block on a disk is numbered 1 and belongs to group number 1. The total number of blocks on a disk (in a disk partition) is a divisor of the disk size expressed in sectors. The number of block groups does not have to divide the number of blocks, because the last block group may be incomplete. The address of the beginning of each block group can be obtained as ((group number - 1) * (number of blocks in the group)).

Each group of blocks has the same structure. Its structure is presented in the table.

The first element of this structure (superblock) is the same for all groups, and all the rest are individual for each group. The superblock is stored in the first block of each block group (with the exception of group 1, which has a boot record in the first block). Superblock is the starting point of the file system. It has a size of 1024 bytes and is always located at offset 1024 bytes from the beginning of the file system. The presence of several copies of the superblock is explained by the extreme importance of this element of the file system. Superblock duplicates are used when recovering a file system after crashes.

The information stored in the superblock is used to organize access to the rest of the data on the disk. The superblock determines the size of the file system, the maximum number of files in the partition, the amount of free space, and contains information about where to look for unallocated areas. When the OS starts, the superblock is read into memory, and all changes to the file system are first reflected in the copy of the superblock held in memory and are written to disk only periodically. This improves system performance because many users and processes are constantly updating files. On the other hand, when the system is shut down, the superblock must be written to disk, so the computer cannot be switched off simply by cutting the power. Otherwise, at the next boot, the information written in the superblock will not correspond to the real state of the file system.

Following the superblock is the description of the group of blocks (Group Descriptors). This description contains:

The address of the block containing the block bitmap of the given group;

Address of the block containing the inode bitmap of the given group;

The address of the block containing the inode table of this group;

Counter of the number of free blocks in this group;

The number of free inodes in this group;

Number of inodes in this group that are directories

and other data.

The information stored in the group description is used to find the block and inode bitmaps and the inode table.

The Ext2 file system is characterized by:

hierarchical structure,
coordinated processing of data arrays,
dynamic file extension,
protection of information in files,
treating peripherals (such as terminals and tape drives) as files.

Internal representation of files

Each file in an Ext2 system has a unique index. The index contains the information that any process needs to access the file. Processes access files using a well-defined set of system calls and identify a file by a character string that serves as its path name. Each path name uniquely identifies a file, and the kernel converts this name into the file's index. The index includes a table of addresses showing where the file's information is located on the disk. Since each block on the disk is addressed by its number, this table stores a collection of disk block numbers. To increase flexibility, the kernel appends one block at a time to a file, which allows the file's information to be scattered throughout the file system. Such a layout, however, complicates the task of finding data. The address table contains the list of block numbers holding the information that belongs to the file.

File inodes

Each file on the disk has a corresponding file inode, which is identified by its ordinal number - the file's index. This means that the number of files that can be created in the file system is limited by the number of inodes, which is either explicitly set when the file system is created or calculated from the physical size of the disk partition. Inodes exist in static form on disk, and the kernel reads them into memory before working with them.

The file inode contains the following information:

The type and permissions of this file.

File owner ID (Owner Uid).

File size in bytes.

The time of the last access to the file (Access time).

File creation time.

The time the file was last modified.

File deletion time.

Group ID (GID).

Link count.

The number of blocks occupied by the file.

File flags

Reserved for OS

Pointers to blocks in which file data is written (an example of direct and indirect addressing in Fig. 1)

File version (for NFS)

File ACL

Directory ACL

Fragment address

Fragment number

Fragment size

Directories

Directories are files.

The kernel stores data in a directory just as it does in a regular file, using an index structure and blocks with direct and indirect address levels. Processes can read data from directories in the same way they read regular files; however, exclusive write access to a directory is reserved for the kernel, which ensures that the directory structure remains correct.

When a process uses a file path, the kernel searches the directories for the corresponding inode number. After the filename has been converted to an inode number, that inode is placed in memory and then used in subsequent requests.

Additional features of EXT2 FS

In addition to the standard Unix features, EXT2fs provides some additional features not normally supported by Unix filesystems.

File attributes allow you to change how the kernel reacts when working with sets of files. You can set attributes on a file or directory. In the second case, files created in this directory will inherit these attributes.

During system mount, some file attributes related features can be set. The mount option allows the administrator to choose how files are created. On a BSD-specific file system, files are created with the same group ID as the parent directory. The features of System V are somewhat more complex. If a directory's setgid bit is set, then created files inherit the directory's group ID, and subdirectories inherit the group ID and setgid bit. Otherwise, files and directories are created with the primary group ID of the calling process.

The EXT2fs system can use synchronous data modification similar to the BSD system. A mount option allows the administrator to specify that all metadata (inodes, bitmap blocks, indirect blocks, and directory blocks) be written to disk synchronously when modified. This can be used to maintain strict metadata consistency, but it results in poor performance. In practice this option is not usually used because, in addition to degrading performance, it can lead to loss of user data that the file system checker will not flag.

EXT2fs allows you to choose the size of the logical block when creating a file system. It can be 1024, 2048 or 4096 bytes in size. The use of large blocks leads to faster I/O operations (because the number of requests to the disk is reduced), and, consequently, to less movement of heads. On the other hand, the use of large blocks leads to a loss of disk space. Usually the last block of a file is not fully used for storing information, so with an increase in the size of the block, the amount of wasted disk space increases.

EXT2fs allows you to use fast symbolic links. Such links do not use file system data blocks: the name of the target file is stored not in a data block but in the inode itself. This saves disk space and speeds up the processing of symbolic links. Of course, the space reserved in the inode is limited, so not every link can be represented as a fast link. The maximum length of the target name in a fast link is 60 characters. In the near future, it is planned to expand this scheme for small files.

EXT2fs monitors the state of the file system. The kernel uses a separate field in the superblock to indicate the state of the file system. If the file system is mounted in read/write mode, its state is set to "Not Clean". If it is unmounted or remounted in read-only mode, its state is set to "Clean". During system boot and file system health checks, this information is used to determine whether a file system check is needed. The kernel also records some errors in this field: when it detects an inconsistency, the file system is marked "Erroneous". The file system checker tests this mark and forces a check even if the state otherwise appears to be "Clean".

Ignoring file system testing for a long time can sometimes lead to some difficulties, so EXT2fs includes two methods for regularly checking the system. The superblock contains a system mount counter. This counter is incremented each time the system is mounted in read/write mode. If its value reaches the maximum value (it is also stored in the super block), then the file system test routine runs a file system check, even if its state is "Clean". The last check time and the maximum interval between checks are also stored in the superblock. When the maximum interval between checks is reached, the state of the file system is ignored and its check is started.
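The decision described in the last two paragraphs can be summarized in a short sketch. The `Superblock` class and its field names are hypothetical and do not reproduce the real ext2 superblock layout; treating a zero interval as "time-based check disabled" is an assumption made for this example.

```python
from dataclasses import dataclass


@dataclass
class Superblock:
    state: str             # "Clean", "Not Clean" or "Erroneous"
    mount_count: int       # read/write mounts since the last check
    max_mount_count: int   # limit stored in the superblock
    last_check: float      # time of the last check, seconds since the epoch
    check_interval: float  # maximum interval in seconds; 0 disables (assumption)


def needs_check(sb, now):
    """Decide whether the file system check should run at boot time."""
    if sb.state != "Clean":
        return True   # not unmounted cleanly, or the kernel recorded an error
    if sb.mount_count >= sb.max_mount_count:
        return True   # mount counter reached its maximum
    if sb.check_interval and now - sb.last_check >= sb.check_interval:
        return True   # too much time has passed since the last check
    return False
```

The last two conditions trigger a check even when the state is "Clean", which is exactly the purpose of the mount counter and the check interval.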

Performance optimization

The EXT2fs system contains many features that optimize its performance, which leads to an increase in the speed of information exchange when reading and writing files.

EXT2fs makes heavy use of the disk buffer. When a block needs to be read, the kernel issues an I/O request to several contiguous blocks. Thus, the kernel tries to make sure that the next block to be read has already been loaded into the disk buffer. Such operations are usually performed when sequentially reading files.

The EXT2fs system also contains a large number of data layout optimizations. Block groups are used to keep related inodes and data blocks together. The kernel always tries to place the data blocks of a file in the same group as its inode. This is intended to reduce the movement of the drive heads when reading an inode and its corresponding data blocks.

When writing data to a file, EXT2fs preallocates up to 8 contiguous blocks when placing a new block. This method allows you to achieve high performance with a heavy system load. It also allows contiguous blocks for files to be allocated, which speeds up their subsequent reading.

FAT file systems

FAT16

The FAT16 file system predates MS-DOS and is supported by all Microsoft operating systems for compatibility. Its name File Allocation Table (file location table) perfectly reflects the physical organization of the file system, the main characteristics of which include the fact that the maximum size of a supported volume (hard disk or partition on a hard disk) does not exceed 4095 MB. In the days of MS-DOS, 4 GB hard drives seemed like an impossible dream (20-40 MB drives were a luxury), so such a reserve was quite justified.

A volume formatted to use FAT16 is divided into clusters. The default cluster size depends on the size of the volume and can range from 512 bytes to 64 KB. Table 2 shows how the cluster size depends on the volume size. Note that the cluster size may differ from the default value, but it must be one of the values listed in Table 2.

It is not recommended to use the FAT16 file system on volumes larger than 511 MB, since disk space will be used extremely inefficiently for relatively small files (a 1-byte file will take 64 KB). Regardless of the cluster size, the FAT16 file system is not supported for volumes larger than 4 GB.
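The wasted space mentioned above is easy to quantify, because a file always occupies a whole number of clusters. The following sketch uses hypothetical function names and only the cluster sizes mentioned in the text.

```python
def clusters_needed(file_size, cluster_size):
    """Number of whole clusters required to hold file_size bytes."""
    return -(-file_size // cluster_size)  # ceiling division


def allocated_bytes(file_size, cluster_size):
    """Disk space actually taken by the file's data."""
    return clusters_needed(file_size, cluster_size) * cluster_size


def slack_bytes(file_size, cluster_size):
    """Space allocated to the file but not used by its contents."""
    return allocated_bytes(file_size, cluster_size) - file_size
```

For a 1-byte file on a volume with 64 KB clusters, `allocated_bytes(1, 64 * 1024)` is 65,536 bytes, of which 65,535 are slack; with 512-byte clusters the same file wastes only 511 bytes.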

FAT32

Starting with Microsoft Windows 95 OEM Service Release 2 (OSR2), Windows introduced support for 32-bit FAT. For Windows NT-based systems, this file system was first supported in Microsoft Windows 2000. While FAT16 can support volumes up to 4 GB, FAT32 can support volumes up to 2 TB. The cluster size in FAT32 can vary from 1 (512 bytes) to 64 sectors (32 KB). FAT32 cluster values require 4 bytes to store (32 bits, not 16 as in FAT16). This means, in particular, that some file utilities designed for FAT16 cannot work with FAT32.

The main difference between FAT32 and FAT16 is that the size of the disk logical partition has changed. FAT32 supports volumes up to 127 GB. At the same time, if when using FAT16 with 2 GB disks, a 32 KB cluster was required, then in FAT32 a 4 KB cluster is suitable for disks from 512 MB to 8 GB (Table 4).

This accordingly means more efficient use of disk space - the smaller the cluster, the less space is required to store the file and, as a result, the disk becomes less fragmented.

When using FAT32, the maximum file size can be up to 4 GB minus 2 bytes. If when using FAT16 the maximum number of entries in the root directory was limited to 512, then FAT32 allows you to increase this number to 65,535.

FAT32 imposes a restriction on the minimum volume size: it must contain at least 65,527 clusters. At the same time, the cluster size cannot be such that the FAT occupies more than 16 MB minus 64 KB; at 4 bytes per entry, this corresponds to about 4 million clusters.

When long filenames are used, the data required for access from FAT16 and FAT32 does not overlap. When a file is created with a long filename, Windows creates the corresponding 8.3 format name and one or more directory entries to store the long name (13 characters of the long filename per entry). Each such entry stores the corresponding part of the filename in Unicode format. These entries have the attributes "volume id", "read-only", "system", and "hidden" set at the same time, a combination that MS-DOS ignores; on that operating system, a file is accessed by its "alias" in 8.3 format.
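The number of directory entries consumed by a long filename follows directly from the figure of 13 characters per entry. The sketch below uses a hypothetical function name and assumes that every character of the name occupies one 16-bit Unicode unit.

```python
import math

CHARS_PER_LFN_ENTRY = 13  # characters of the long name stored in one entry


def directory_entries_needed(long_name):
    """Entries used by a file: long-name entries plus one 8.3 alias entry."""
    # Assumption: each character of the name fits in one 16-bit unit.
    lfn_entries = math.ceil(len(long_name) / CHARS_PER_LFN_ENTRY)
    return lfn_entries + 1
```

A 22-character name such as "Annual Report 2000.doc" therefore takes three entries: two for the long name and one for the 8.3 alias. This is also why long filenames reduce the number of files that fit in a size-limited FAT16 root directory.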

NTFS file system

Microsoft Windows 2000 includes support for a new version of the NTFS file system, which, in particular, provides work with Active Directory directory services, reparse points, information security tools, access control, and a number of other features.

As with FAT, the basic unit of information in NTFS is the cluster. Table 5 shows the default cluster sizes for volumes of various sizes.

When you create an NTFS file system, the formatter creates a Master File Table (MFT) file and other areas for storing metadata. Metadata is used by NTFS to implement the file structure. The first 16 entries in the MFT are reserved by NTFS itself. The location of the metadata files $Mft and $MftMirr is recorded in the boot sector of the disk. If the first entry in the MFT is corrupted, NTFS reads the second entry to find a copy of the first. A complete copy of the boot sector is located at the end of the volume. Table 6 lists the main metadata stored in the MFT.

The remaining MFT entries describe each file and directory located on the volume.

Typically, one file uses one entry in the MFT, but if the file has a large set of attributes or becomes too fragmented, additional entries may be required to store information about it. In this case, the first record about the file, called the base record, stores the location of the other records. Data about files and directories of small size (up to 1500 bytes) is completely contained in the first entry.

File attributes in NTFS

Each occupied sector on an NTFS volume belongs to a particular file. Even the file system metadata is part of a file. NTFS treats each file (or directory) as a set of file attributes. Elements such as the file name, its security information, and even the data in it are attributes of the file. Each attribute is identified by a specific type code and, optionally, by an attribute name.

If the attributes of a file fit within the file's MFT record, they are called resident attributes. The file name and the creation date are always resident. When the information about a file is too large to fit into a single MFT record, some of the file's attributes become non-resident. Non-resident attributes are stored in one or more clusters and represent an alternate data stream for the current volume (more on that below). To describe the location of resident and non-resident attributes, NTFS creates an Attribute List attribute.

Table 7 shows the main file attributes defined in NTFS. This list may be expanded in the future.

CDFS file system

Windows 2000 provides support for the CDFS file system, which conforms to the ISO 9660 standard describing the layout of information on a CD-ROM. Long filenames are supported according to ISO 9660 Level 2.

When creating a CD-ROM for use with Windows 2000, keep the following in mind:

all directory and file names must be less than 32 characters;
all directory and file names must contain only uppercase characters;
the depth of directories should not exceed 8 levels from the root;
the use of filename extensions is optional.

Comparison of file systems

Under Microsoft Windows 2000, FAT16, FAT32, NTFS, or combinations of these file systems can be used. The choice of file system depends on the following criteria:

how the computer is used;
hardware platform;
size and number of hard drives;
information security requirements.

FAT file systems

As you may have noticed, the numbers in the names of the file systems, FAT16 and FAT32, indicate the number of bits used to store the numbers of the clusters occupied by a file. So, FAT16 uses 16-bit addressing and can therefore use up to 2^16 addresses. In Windows 2000, four bits of each FAT32 file allocation table entry are reserved for internal use, so FAT32 provides 2^28 addresses.

Table 8 shows cluster sizes for FAT16 and FAT32 file systems.
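The address counts and the resulting volume limits can be verified with simple arithmetic. This is a sketch with hypothetical function names; it gives upper bounds only and ignores the few cluster numbers that the operating system reserves.

```python
def max_clusters(address_bits, reserved_bits=0):
    """Number of distinct cluster addresses available."""
    return 2 ** (address_bits - reserved_bits)


def max_volume_bytes(clusters, cluster_size):
    """Upper bound on volume size for a given cluster size."""
    return clusters * cluster_size


FAT16_CLUSTERS = max_clusters(16)                   # 65,536
FAT32_CLUSTERS = max_clusters(32, reserved_bits=4)  # 268,435,456
```

With 64 KB clusters, `max_volume_bytes(FAT16_CLUSTERS, 64 * 1024)` is exactly 4 GB, which matches the FAT16 limit under Windows 2000, while 32 KB clusters give 2 GB.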

In addition to significant differences in cluster size, FAT32 also allows the root directory to expand (in FAT16, the number of entries is limited to 512 and can be even lower when using long filenames).

Benefits of FAT16

Among the advantages of FAT16 are the following:

the file system is supported by MS-DOS, Windows 95, Windows 98, Windows NT, Windows 2000, and some UNIX operating systems;
there are a large number of programs that allow you to correct errors in this file system and recover data;
if there are problems with booting from the hard disk, the system can be booted from the floppy disk;
this file system is quite efficient for volumes smaller than 256 MB.

Disadvantages of FAT16

The main disadvantages of FAT16 include:

the root directory cannot contain more than 512 entries. Using long filenames greatly reduces the number of these elements;
FAT16 supports a maximum of 65,536 clusters, and since some clusters are reserved by the operating system, the number of available clusters is 65,524. Each cluster has a fixed size for a given logical drive. When the maximum number of clusters is reached at their maximum size (64 KB), the maximum supported volume is limited to 4 GB (under Windows 2000). To maintain compatibility with MS-DOS, Windows 95, and Windows 98, where the cluster size does not exceed 32 KB, the size of a FAT16 volume must not exceed 2 GB;
FAT16 does not support built-in file protection and compression;
on large disks, a lot of space is wasted due to the fact that the maximum cluster size is used. The space for the file is allocated based on the size of the cluster, not the file.

Benefits of FAT32

Among the advantages of FAT32 are the following:

disk space allocation is performed more efficiently, especially for large disks;
the root directory in FAT32 is a regular chain of clusters and can be located anywhere on the disk. Because of this, FAT32 does not impose any restrictions on the number of items in the root directory;
due to the use of smaller clusters (4 KB on disks up to 8 GB), the occupied disk space is usually 10-15% less than under FAT16;
FAT32 is the more reliable file system. In particular, it supports relocating the root directory and using a backup copy of the FAT. In addition, the boot record contains a number of data items critical for the file system.

Disadvantages of FAT32

The main disadvantages of FAT32:

the size of a volume formatted as FAT32 under Windows 2000 is limited to 32 GB;
FAT32 volumes are not accessible from other operating systems, only from Windows 95 OSR2 and Windows 98;
boot sector backup is not supported;
FAT32 does not support built-in file protection and compression.

NTFS file system

When using Windows 2000, Microsoft recommends formatting all hard disk partitions as NTFS, except in multi-boot configurations that include operating systems other than Windows 2000 and Windows NT. Using NTFS instead of FAT gives you access to features available only in NTFS. These include, in particular:

recoverability. This feature is "built into" the file system. NTFS protects data by means of a transaction log and recovery algorithms. In the event of a system failure, NTFS uses the log and additional information to automatically restore the integrity of the file system;
information compression. For NTFS volumes, Windows 2000 supports compression of individual files. Such compressed files can be used by Windows applications without prior decompression, which happens automatically when the file is read. When the file is closed and saved, it is compressed again.
In addition, the following advantages of NTFS can be distinguished:

Some operating system features require NTFS;

Access speed is much faster - NTFS minimizes the number of disk accesses required to find a file;

Protection of files and directories. Only on NTFS volumes is it possible to set access permissions for files and folders;

When using NTFS, Windows 2000 supports volumes up to 2TB;

The file system maintains a backup copy of the boot sector - it is located at the end of the volume;

NTFS supports the Encrypting File System (EFS), which protects the contents of files against unauthorized access;

When using quotas, you can limit the amount of disk space used by users.

Disadvantages of NTFS

Speaking about the shortcomings of the NTFS file system, it should be noted that:

NTFS volumes are not available on MS-DOS, Windows 95, and Windows 98. In addition, a number of features that are available in NTFS under Windows 2000 are not available under Windows NT 4.0 and earlier;
Small volumes containing many small files may experience performance degradation compared to FAT.

File system and speed

As we have already found out, for small volumes, FAT16 or FAT32 provides faster file access compared to NTFS, because:

FAT has a simpler structure;
directories are smaller;
FAT does not support protecting files from unauthorized access - the system does not need to check file permissions.

NTFS minimizes the number of disk accesses and the time it takes to find a file. Also, if the directory size is small enough to fit in a single MFT entry, the entire entry is read in one go.

A FAT directory entry contains the number of the first cluster of the file or directory. To read a file, FAT has to follow its chain of clusters through the file allocation table.

When comparing the speed of operations performed for directories containing short and long file names, it should be taken into account that the speed of operations for FAT depends on the operation itself and the size of the directory. If FAT looks for a file that doesn't exist, it searches the entire directory, an operation that takes longer than searching the B-tree structure used by NTFS. The average time it takes to find a file in FAT is expressed as a function of N/2, in NTFS it is expressed as log N, where N is the number of files.
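The difference between N/2 and log N can be illustrated by counting comparisons. In this sketch a plain list scan stands in for a FAT directory, and a binary search over a sorted list stands in for the NTFS B-tree; this is a simplification, since a real B-tree is a different structure with the same logarithmic behaviour:

```python
def linear_lookup(names, target):
    """Scan an unsorted directory entry by entry; returns (found, comparisons)."""
    comparisons = 0
    for name in names:
        comparisons += 1
        if name == target:
            return True, comparisons
    return False, comparisons

def sorted_lookup(sorted_names, target):
    """Binary search over sorted names; returns (found, probes)."""
    low, high = 0, len(sorted_names)
    probes = 0
    while low < high:
        mid = (low + high) // 2
        probes += 1
        if sorted_names[mid] == target:
            return True, probes
        if sorted_names[mid] < target:
            low = mid + 1
        else:
            high = mid
    return False, probes

# Hypothetical directory with 1000 files.
names = [f"FILE{i:04d}.TXT" for i in range(1000)]

print(linear_lookup(names, "MISSING.TXT"))          # has to visit every entry
print(sorted_lookup(sorted(names), "MISSING.TXT"))  # needs only a few probes
```

A search for a missing file is the worst case for the linear scan, because every entry has to be examined before the search can give up.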

A number of the following factors affect the speed of reading and writing files under Windows 2000:

file fragmentation. If the file is highly fragmented, NTFS usually requires fewer disk accesses than FAT to find all the fragments;
cluster size. For both file systems, the default cluster size depends on the size of the volume and is always expressed as a power of 2. Addresses in FAT16 are 16-bit, in FAT32 they are 32-bit, in NTFS they are 64-bit;
the default cluster size in FAT is based on the fact that the file allocation table can have no more than 65,535 entries, so the cluster size is a function of the volume size divided by 65,535. Thus, the default cluster size for a FAT volume is always larger than the cluster size for an NTFS volume of the same size. Note that a larger cluster size means that FAT volumes can be less fragmented;
location of small files. When using NTFS, small files are contained in an MFT record. The size of a file that fits into a single MFT record depends on the number of attributes in that file.
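The rule "volume size divided by 65,535" can be sketched as follows. This is an illustration under stated assumptions (512-byte sectors, power-of-two cluster sizes); real formatting tools pick the default from fixed tables, so the function name and approach are hypothetical:

```python
MB = 1024 ** 2
GB = 1024 ** 3

def min_fat16_cluster_size(volume_bytes, sector_size=512, max_entries=65_535):
    """Smallest power-of-two cluster size that keeps the volume within max_entries clusters."""
    cluster = sector_size
    while volume_bytes > cluster * max_entries:
        cluster *= 2
    return cluster

for volume in (100 * MB, 256 * MB, 1 * GB):
    size = min_fat16_cluster_size(volume)
    print(f"{volume // MB} MB volume: {size // 1024} KB clusters")
```

The cluster size doubles each time the volume outgrows what 65,535 clusters of the previous size can cover.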

Maximum size of NTFS volumes

Theoretically, NTFS supports volumes with up to 2^32 clusters. However, apart from the lack of hard drives of this size, there are other restrictions on the maximum volume size.

One such limitation is the partition table. Industry standards limit the partition table to 2^32 sectors. Another limitation is the sector size, which is typically 512 bytes. Although the sector size may change in the future, the current size limits a single volume to 2 TB (2^32 x 512 bytes = 2^41 bytes). Thus, 2 TB is the practical limit for NTFS physical and logical volumes.
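The 2 TB figure can be checked directly. A short sketch; the 4096-byte case is added only to show how the limit would move with a larger sector size and is an assumption, not something stated in the article:

```python
TB = 2 ** 40

def partition_limit_bytes(sector_size, max_sectors=2 ** 32):
    """Largest volume addressable with a 32-bit sector count."""
    return sector_size * max_sectors

print(partition_limit_bytes(512) // TB, "TB with 512-byte sectors")
print(partition_limit_bytes(4096) // TB, "TB with 4096-byte sectors")
```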

Table 11 shows the main limitations of NTFS.

Managing access to files and directories

When using NTFS volumes, you can set file and directory permissions. These access rights specify which users and groups have access to them and what level of access is allowed. Such access rights apply both to users working on the computer on which the files are located, and to users accessing files over the network when the file is located in a directory open for remote access.

Under NTFS, you can also set remote access permissions combined with file and directory permissions. In addition, file attributes (read-only, hidden, system) also restrict access to the file.

Under FAT16 and FAT32, it is also possible to set file attributes, but they do not provide file permissions.

The version of NTFS used in Windows 2000 introduced a new type of access permission called inherited permissions. The Security tab contains the option "Allow inheritable permissions from parent to propagate to this file object", which is enabled by default. This option significantly reduces the time required to change the permissions of files and subdirectories. For example, to change the permissions of a tree containing hundreds of subdirectories and files, it is enough to enable this option, whereas in Windows NT 4 you had to change the permissions of each individual file and subdirectory.

Figure 5 shows the Properties dialog box and the Security tab (Advanced section) listing extended file permissions.

Recall that for FAT volumes, access can only be controlled at the volume level, and such control is possible only with remote access.

Compressing files and directories

Windows 2000 supports compression of files and directories located on NTFS volumes. Compressed files are readable and writable by any Windows application, with no need to decompress them first. The compression algorithm is similar to that used in DoubleSpace (MS-DOS 6.0) and DriveSpace (MS-DOS 6.22), with one significant difference: under MS-DOS, an entire primary partition or logical drive is compressed, while under NTFS you can compress individual files and directories.

The compression algorithm in NTFS is designed to support clusters up to 4 KB in size. If the cluster size is larger than 4 KB, the NTFS compression features become unavailable.

Self-healing NTFS

The NTFS file system is self-healing and can maintain its integrity through the use of a log of actions taken and a number of other mechanisms.

NTFS treats every operation that modifies system files on NTFS volumes as a transaction and stores information about each transaction in a log. A started transaction can either be fully completed (commit) or rolled back (rollback). In the latter case, the NTFS volume returns to the state it was in before the transaction started. To manage transactions, NTFS writes all the operations involved in a transaction to a log file before they are written to disk. Once the transaction is complete, all of its operations are carried out. Thus, under NTFS, no operation is left half-finished: in the event of a disk failure, incomplete operations are simply cancelled.
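The idea of logging operations before applying them can be shown with a toy model. The class below is purely illustrative: its names and log layout are invented for this sketch and have nothing to do with the actual NTFS on-disk log format.

```python
class ToyVolume:
    """Toy write-ahead log: operations are logged first and applied only on commit."""

    def __init__(self):
        self.disk = {}      # committed state: file name -> contents
        self.log = []       # ("op", txid, name, value) or ("commit", txid)
        self._next_tx = 1

    def begin(self):
        txid = self._next_tx
        self._next_tx += 1
        return txid

    def write(self, txid, name, value):
        # The operation goes to the log only; the disk is not touched yet.
        self.log.append(("op", txid, name, value))

    def commit(self, txid):
        self.log.append(("commit", txid))
        for entry in self.log:
            if entry[0] == "op" and entry[1] == txid:
                self.disk[entry[2]] = entry[3]

    def recover(self):
        """After a crash: redo committed transactions, discard unfinished ones."""
        committed = {entry[1] for entry in self.log if entry[0] == "commit"}
        for entry in self.log:
            if entry[0] == "op" and entry[1] in committed:
                self.disk[entry[2]] = entry[3]
        self.log = [entry for entry in self.log if entry[1] in committed]


vol = ToyVolume()
t1 = vol.begin()
vol.write(t1, "a.txt", "hello")
vol.commit(t1)

t2 = vol.begin()
vol.write(t2, "b.txt", "half-written")  # simulated crash: no commit follows
vol.recover()
print(vol.disk)
```

After recovery only the committed transaction is visible; the unfinished one leaves no trace on the "disk".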

Under the control of NTFS, operations are also performed that allow you to identify bad clusters on the fly and allocate new clusters for file operations. This mechanism is called cluster remapping.

In this review, we examined the various file systems supported in Microsoft Windows 2000, discussed the design of each of them, and noted their advantages and disadvantages. The most promising is the NTFS file system, which has a large set of features that are not available in other file systems. The new version of NTFS supported by Microsoft Windows 2000 has even more functionality and is therefore recommended for use when installing the Windows 2000 operating system.

ComputerPress 7'2000

Many users do not understand the basics of how Windows file systems work. Why bother with theory, one might ask? In fact, knowing how the various file systems work in depth is what allows you to choose the right file system for a given storage medium. A wrong choice can become critical later, when you need to recover data or when the medium wears out prematurely.

The file system consists of a file management system and a collection of files on a certain type of media (CD, DVD, FDD, HDD, Flash, etc.). A file management system provides users and applications with the ability to access files, store them, and maintain the integrity of their content. The most common long-term storage medium in modern computing systems is the hard drive - "Winchester". This term applies to any sealed disc with aerodynamically designed magnetic read heads.

The file systems of modern operating systems are installed on hard disk partitions.

FAT 32. Simplicity and reliability.

There are three FAT file systems: FAT12 (for floppy disks), FAT16, and FAT32. They differ in the number of bits (12, 16, or 32) used to store the cluster number in the file management system. In FAT file systems, the space of any logical drive is divided into a system area and a data area. The parts of the logical drive are designated as follows: BR is the boot record; RS, the reserved sectors; FAT1 and FAT2, file allocation tables 1 and 2; RDir (root directory, ROOT), the root directory. The data area is divided into clusters, each consisting of one or more contiguous sectors. In the FAT, clusters belonging to the same file are linked in a chain. The file allocation table is, in effect, a map of the data area. Each FAT element (12, 16, or 32 bits) corresponds to one disk cluster and describes its state: free, in use, or bad. The FAT16 file management system uses a 16-bit word for the cluster number, so 65,536 clusters can be addressed.

A cluster is the minimum addressable unit of disk space allocated to a file. A file or directory occupies a whole number of clusters. Dividing the data area into clusters rather than sectors makes it possible to reduce the size of the FAT, reduce file fragmentation, shorten file chains, and speed up file access. The last cluster of a file may not be fully used, which leads to a noticeable loss of disk space if the cluster size is large. On a floppy disk, a cluster occupies 1 or 2 sectors; on a hard disk, a cluster contains 4, 8, 16, 32, or 64 sectors. Each directory entry has the following structure: file name, file attributes, reserved field, creation time, creation date, last access date, reserved field, last modification date, last modification time, starting cluster number in the FAT, file size.

In this example, the file named MyFile.txt starts at cluster 8 and occupies 12 clusters. The chain of clusters in this case is 8, 9, A, B, 15, 16, 17, 19, 1A, 1B, 1C, 1D. Cluster 18 is marked as bad with the code F7 and cannot be used to store data. This code is set by disk formatting and checking utilities. Cluster 1D is marked with the code FF as the last cluster of the file. Free clusters are marked with the code 0.

When a new cluster is allocated for writing to a file, the first free cluster is taken. Since files on the disk are changed, deleted, moved, enlarged, and shrunk, this placement rule leads to fragmentation: the data of one file does not lie in adjacent clusters, and the clusters are sometimes very far apart. A complex chain is formed, which slows down work with files. Since the FAT is used very intensively during disk access, it is loaded into RAM. FAT32 uses disk space much more efficiently because it uses smaller clusters than previous versions of FAT. Compared to FAT16, this saves 10-16%.
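The MyFile.txt cluster chain can be reproduced with a small model of the table. This is a sketch, assuming the one-byte codes used in the example (0 for free, F7 for bad, FF for end of chain) and a table of only 32 entries; the function names are illustrative:

```python
FREE, BAD, END_OF_CHAIN = 0x00, 0xF7, 0xFF  # codes taken from the example

def build_example_fat():
    """Build a 32-entry table holding the MyFile.txt chain from the example."""
    chain = [0x08, 0x09, 0x0A, 0x0B, 0x15, 0x16, 0x17,
             0x19, 0x1A, 0x1B, 0x1C, 0x1D]
    fat = [FREE] * 0x20
    for current, following in zip(chain, chain[1:]):
        fat[current] = following      # each entry points to the next cluster
    fat[chain[-1]] = END_OF_CHAIN     # last cluster of the file
    fat[0x18] = BAD                   # cluster 18 is unusable
    return fat

def follow_chain(fat, first_cluster):
    """Collect all clusters of a file, starting from its first cluster."""
    chain = []
    cluster = first_cluster
    while True:
        chain.append(cluster)
        if len(chain) > len(fat):
            raise ValueError("loop detected in cluster chain")
        entry = fat[cluster]
        if entry == END_OF_CHAIN:
            return chain
        if entry in (FREE, BAD):
            raise ValueError(f"chain broken at cluster {cluster:X}")
        cluster = entry

fat = build_example_fat()
print([f"{c:X}" for c in follow_chain(fat, 0x08)])
```

The directory entry stores only the first cluster (8); every other cluster of the file is found by walking the table.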

A directory element in an attribute field can store the following values:

1) archive (set when a file is changed and cleared by a program that backs up files to another medium);

2) directory;

3) volume label;

4) system;

5) hidden;

6) read-only.
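These attributes are stored as individual bits of a single attribute byte in the directory entry. A short sketch using the standard FAT bit values; the class and function names are illustrative:

```python
from enum import IntFlag

class FatAttr(IntFlag):
    """Bits of the attribute byte in a FAT directory entry."""
    READ_ONLY = 0x01
    HIDDEN = 0x02
    SYSTEM = 0x04
    VOLUME_LABEL = 0x08
    DIRECTORY = 0x10
    ARCHIVE = 0x20

def describe(attr_byte):
    """Return the names of all attributes set in the given byte."""
    return [flag.name for flag in FatAttr if attr_byte & flag]

print(describe(0x10))   # a plain directory
print(describe(0x27))   # a read-only, hidden system file with the archive bit set
```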

Long names in FAT32 are implemented using several directory entries per file: one entry for the 8.3 name plus up to 20 entries for the long name, which can be up to 255 characters long. Long names therefore use up directory entries quickly and are not recommended.

In general, the FAT file system is best avoided today, so it is important to choose a file system that lets you do without it.

NTFS: convenience and high speed.

One of the basic concepts in NTFS is the volume. It is possible to create a fault-tolerant volume that spans several partitions, that is, to use RAID technology. NTFS divides the entire usable disk space of a volume into clusters, blocks of data addressed as single units. NTFS supports cluster sizes from 512 bytes to 64 KB; 2 or 4 KB is considered standard. Part of the disk is set aside for the MFT zone, the space into which the main service metafile, the MFT, can grow. File data is not written to this area. The MFT zone is kept empty so that the MFT fragments as little as possible as it grows.

MFT (Master File Table) is a centralized directory of all the files on the disk, including itself. The MFT is divided into fixed-size 1 KB records, each of which corresponds to a file. The first 16 files are service files and are inaccessible to the operating system; they are called metafiles, and the very first metafile is the MFT itself. These first 16 MFT records are the only part of the disk that has a strictly fixed position. Because they are so important, a copy of these 16 records is kept in the middle of the volume for safety. The rest of the MFT can be located anywhere on the disk: its position can be recovered from the MFT itself, starting from its first record. Each file in NTFS is represented by streams: it does not hold "just data" but a set of streams. One of the streams is the file data. You can define multiple data streams for a single file.

Main features of NTFS:

Work on large disks is efficient (much more efficient than in FAT);

There are means to restrict access to files and directories;

NTFS partitions provide local security for both files and directories;

A transaction mechanism has been introduced, in which file operations are logged;

Significant increase in reliability;

Removed many restrictions on the maximum number of disk sectors and/or clusters;

A file name in NTFS, unlike in the FAT and HPFS file systems, can contain any characters, including the full set of national alphabets, since names are stored in Unicode, a 16-bit representation that gives 65,536 different characters. The maximum length of a file name in NTFS is 255 characters.

NTFS also has built-in compression that you can apply to individual files, entire directories, and even volumes (and then override or reassign them as you see fit). A directory in NTFS is a special file that stores links to other files and directories.

The main drawback of the NTFS file system is that service data takes up a lot of space (for example, each element of the directory takes 2 KB) - for small partitions, service data can take up to 25% of the media volume.

Thus, choosing a file system is not an abstract decision: it is a set of decisions that affect the entire system. Why do you need to know the ins and outs of the file system in such detail? It is necessary for recovering it if something goes wrong, which we will discuss in one of the following articles.

In addition to all its other tasks, the operating system fulfills its main purpose: it organizes work with data according to a certain structure. The file system serves this purpose. What a file system is, what kinds exist, and other information about it are presented below.

General description

The file system is the part of the operating system responsible for placing, storing, and deleting information on media, providing users and applications with this information, and ensuring its safe use. It also helps with data recovery in the event of a hardware or software failure. This is why the file system is so important. What kinds of file systems are there? There are several types:

For hard drives, that is, devices with random access;

For magnetic tapes, that is, devices with serial access;

For optical media;

Virtual systems;

Network systems.

The logical unit of data storage in the file system is a file, that is, an ordered collection of data that has a specific name. All data used by the operating system is presented in the form of files: programs, images, texts, music, videos, as well as drivers, libraries, and so on. Each such element has a name, type, extension, attributes, and size. The file system, then, is a collection of such elements together with the ways of working with them. Depending on the form in which it is used and the principles that apply to it, several main types of file systems can be distinguished.

Program approach

A file system is a multi-level structure. At its top level is a file system switch that provides an interface between the system and a specific application. It converts file requests into a format accepted by the next level, the file system drivers. These, in turn, call the drivers of the specific devices that store the required information.

For client-server applications, the requirements for file system performance are quite high. Modern systems are designed to provide efficient access, support for large volumes of media, data protection from unauthorized access, and maintaining the integrity of information.

FAT file system

This type was developed back in 1977 by Bill Gates and Marc McDonald. It was originally used in the 86-DOS operating system. Initially, FAT could not support hard drives and worked only with floppy disks of up to 1 megabyte. This restriction is no longer relevant, and Microsoft used this file system in MS-DOS 1.0 and subsequent versions. FAT uses certain file naming conventions:

The name must start with a letter or number and can contain any ASCII character except spaces and special characters;

The name is at most 8 characters long, followed by a dot and an extension of up to three characters;

File names can be written in any case; case is neither distinguished nor preserved.
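These naming rules can be expressed as a small validator. This is a simplified sketch: the set of forbidden "special" characters is an assumption based on the characters commonly disallowed in 8.3 names, and the function name is illustrative.

```python
# Assumed set of characters not allowed in an 8.3 name (including space and dot).
FORBIDDEN = set('"*+,/:;<=>?[\\]| .')

def normalize_83(name):
    """Return the name as FAT would store it (upper case) or raise ValueError."""
    base, _, ext = name.partition(".")
    if not 1 <= len(base) <= 8:
        raise ValueError("base name must be 1 to 8 characters long")
    if len(ext) > 3:
        raise ValueError("extension must be at most 3 characters long")
    if not base[0].isalnum():
        raise ValueError("name must start with a letter or digit")
    for ch in base + ext:
        if ch in FORBIDDEN or not ch.isascii() or not ch.isprintable():
            raise ValueError(f"character {ch!r} is not allowed")
    # Case is not preserved: everything is stored in upper case.
    return f"{base}.{ext}".upper() if ext else base.upper()

print(normalize_83("readme.txt"))
print(normalize_83("ReadMe.TXT") == normalize_83("README.txt"))
```

Because names are normalized to upper case, "readme.txt" and "README.TXT" refer to the same file.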

Since FAT was originally designed for the single-user DOS operating system, it did not provide for storing data about the owner or access rights. At the moment, this file system is the most widespread, and most operating systems support it to one degree or another. Its versatility makes it suitable for volumes shared by different operating systems. It is a simple file system that cannot prevent file corruption caused by an improper shutdown. Operating systems based on it include special utilities that check the structure and correct file inconsistencies.

NTFS file system

This file system is the preferred choice for Windows NT, as it was developed specifically for it. The OS includes the convert utility, which converts FAT and HPFS volumes to NTFS volumes. NTFS significantly expands the ability to control access to individual directories and files, introduces many attributes, implements dynamic file compression and fault tolerance, and supports the requirements of the POSIX standard. File names can be up to 255 characters long, and a short name is generated in the same way as in VFAT. In the event of an operating system failure, NTFS is able to recover itself, so the disk volume remains available and the directory structure is not damaged.

Features of NTFS

On an NTFS volume, each file is represented by a record in the MFT. The first 16 records are reserved by the file system itself for storing special information. The very first record describes the file table itself. If the first record is damaged, NTFS reads the second record to find the MFT mirror file, whose first record is identical to the first record of the main table. A copy of the boot file is stored at the logical center of the disk. The third record contains the log file, which is used for data recovery. The seventeenth and subsequent records of the file table contain information about the files and directories on the hard disk.

The transaction log contains a complete set of operations that change the volume structure, including operations to create files, as well as any commands that affect the directory structure. The transaction log is designed to recover NTFS from a system failure. The entry for the root directory contains a list of the directories and files that are in the root directory.

EFS Features

The Encrypting File System (EFS) is a Windows feature that stores information on a hard drive in encrypted form. Encryption is the strongest protection this operating system can offer. For the user, encryption is simple: you only need to check a box in the properties of the folder or file. You can specify who can read such files. Files are encrypted when they are closed and are automatically ready for use when they are opened.

Features of RAW

Devices designed for data storage are the most vulnerable components, which are most often subject to damage not only physically, but also logically. Certain hardware problems can be fatal, while others have solutions. Sometimes users have a question: "What is the RAW file system?"

As you know, in order to write any information to a hard drive or flash drive, the drive must have a file system. The most common are FAT and NTFS. RAW, however, is not even a file system in the usual sense. In fact, it indicates a logical error in an already installed file system, that is, its effective absence as far as Windows is concerned. Most often, RAW is associated with the destruction of the file system structure. After that, the OS not only cannot access the data but also does not display technical information about the device.

UDF Features

The Universal Disk Format (UDF) is designed to replace CDFS and add support for DVD-ROM devices. In essence, it is a new implementation that meets the requirements of the older format. It is characterized by certain features:

Filenames can be up to 255 characters long;

The name can be lower or upper case;

The maximum path length is 1023 characters.

Starting with Windows XP, this file system is read/write.

The exFAT file system is used for flash drives that are meant to be shared between computers running different operating systems, in particular Windows and Linux. It was exFAT that became the "bridge" between them, since it can work with data coming from operating systems that each have their own file system. What it is and how it works will become clear in practice.

Conclusions

As is clear from the above, each operating system uses certain file systems. They are designed to store ordered data structures on physical media. If, while using a computer, you find yourself asking what the "destination file system" is, you most likely tried to copy a file to some media and received a message that the file exceeds the allowed size. That is why you need to know the maximum file size each file system accepts, so that you do not run into problems when transferring information.
