RTEMS 6.1-rc1
Files | |
file | bdbuf.h |
Block Device Buffer Management. | |
file | bdbuf.c |
Data Structures | |
struct | rtems_bdbuf_buffer |
struct | rtems_bdbuf_group |
struct | rtems_bdbuf_config |
Macros | |
#define | RTEMS_BDBUF_MAX_READ_AHEAD_BLOCKS_DEFAULT 0 |
#define | RTEMS_BDBUF_MAX_WRITE_BLOCKS_DEFAULT 16 |
#define | RTEMS_BDBUF_SWAPOUT_TASK_PRIORITY_DEFAULT 15 |
#define | RTEMS_BDBUF_SWAPOUT_TASK_SWAP_PERIOD_DEFAULT 250 |
#define | RTEMS_BDBUF_SWAPOUT_TASK_BLOCK_HOLD_DEFAULT 1000 |
#define | RTEMS_BDBUF_SWAPOUT_WORKER_TASKS_DEFAULT 0 |
#define | RTEMS_BDBUF_SWAPOUT_WORKER_TASK_PRIORITY_DEFAULT RTEMS_BDBUF_SWAPOUT_TASK_PRIORITY_DEFAULT |
#define | RTEMS_BDBUF_READ_AHEAD_TASK_PRIORITY_DEFAULT RTEMS_BDBUF_SWAPOUT_TASK_PRIORITY_DEFAULT |
#define | RTEMS_BDBUF_TASK_STACK_SIZE_DEFAULT RTEMS_MINIMUM_STACK_SIZE |
#define | RTEMS_BDBUF_CACHE_MEMORY_SIZE_DEFAULT (64 * 512) |
#define | RTEMS_BDBUF_BUFFER_MIN_SIZE_DEFAULT (512) |
#define | RTEMS_BDBUF_BUFFER_MAX_SIZE_DEFAULT (4096) |
Typedefs | |
typedef struct rtems_bdbuf_group | rtems_bdbuf_group |
typedef struct rtems_bdbuf_buffer | rtems_bdbuf_buffer |
typedef struct rtems_bdbuf_config | rtems_bdbuf_config |
Enumerations | |
enum | rtems_bdbuf_buf_state { RTEMS_BDBUF_STATE_FREE = 0 , RTEMS_BDBUF_STATE_EMPTY , RTEMS_BDBUF_STATE_CACHED , RTEMS_BDBUF_STATE_ACCESS_CACHED , RTEMS_BDBUF_STATE_ACCESS_MODIFIED , RTEMS_BDBUF_STATE_ACCESS_EMPTY , RTEMS_BDBUF_STATE_ACCESS_PURGED , RTEMS_BDBUF_STATE_MODIFIED , RTEMS_BDBUF_STATE_SYNC , RTEMS_BDBUF_STATE_TRANSFER , RTEMS_BDBUF_STATE_TRANSFER_PURGED } |
State of a buffer of the cache. | |
Variables | |
const rtems_bdbuf_config | rtems_bdbuf_configuration |
The Block Device Buffer Management implements a cache between the disk devices and file systems. The code provides read-ahead and write queuing to the drivers and fast cache look-up using an AVL tree.
The block size used by a file system can be set at runtime and must be a multiple of the disk device block size. The disk device's physical block size is called the media block size. The file system can set the block size it uses to a larger multiple of the media block size. The driver must be able to handle buffer sizes larger than one media block.
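As a rough illustration of the block size rule above, the following sketch checks that a file system block size is a multiple of the media block size and maps a file system block number to the first media block it covers. All `sketch_` names are hypothetical helpers and are not part of the bdbuf API; the checks follow the prose above rather than the actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if block_size is a non-zero multiple of the
 * media block size, as the text above requires. */
bool sketch_block_size_is_valid(uint32_t media_block_size, uint32_t block_size)
{
  return media_block_size != 0
    && block_size != 0
    && block_size % media_block_size == 0;
}

/* Hypothetical helper: number of media blocks covered by one file system
 * block. The sizes must already be valid. */
uint32_t sketch_media_blocks_per_block(
  uint32_t media_block_size,
  uint32_t block_size
)
{
  assert(sketch_block_size_is_valid(media_block_size, block_size));
  return block_size / media_block_size;
}

/* Hypothetical helper: first media block of file system block `block`. */
uint64_t sketch_first_media_block(
  uint32_t media_block_size,
  uint32_t block_size,
  uint32_t block
)
{
  return (uint64_t) block
    * sketch_media_blocks_per_block(media_block_size, block_size);
}
```

With a 512 byte media block size and a 2048 byte file system block size, each block covers four media blocks, so file system block 10 starts at media block 40.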
The user configures the amount of memory to be used as buffers in the cache, and the minimum and maximum buffer size. The cache will allocate additional memory for the buffer descriptors and groups. There are enough buffer descriptors allocated so all the buffer memory can be used as minimum sized buffers.
The cache is a single pool of buffers. The buffer memory is divided into groups, where the size of buffer memory allocated to a group is the maximum buffer size. A group's memory can be divided down into smaller buffer sizes that are a multiple of 2 of the minimum buffer size. A group is the minimum allocation unit for buffers of a specific size. If a buffer of maximum size is requested, the group will have a single buffer. If a buffer of minimum size is requested, the group is divided into minimum sized buffers and the remaining buffers are held ready for use. A group keeps track of which buffers are with a file system or driver, and groups that have buffers in use cannot be reallocated. Groups with no buffers in use can be taken and reallocated to a new size. This is how buffers of different sizes move around the cache.
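The arithmetic behind groups and descriptors can be sketched as follows. This is a simplified model under stated assumptions, not the cache's actual code: the `sketch_` functions are hypothetical, the maximum buffer size is assumed to be a power-of-two multiple of the minimum, and a requested size is rounded up to the next size reachable by doubling the minimum.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of groups, one per maximum sized buffer. */
size_t sketch_group_count(size_t cache_memory, size_t max_size)
{
  assert(max_size != 0);
  return cache_memory / max_size;
}

/* Hypothetical helper: descriptors needed so that all of the buffer memory
 * can be used as minimum sized buffers. */
size_t sketch_descriptor_count(size_t cache_memory, size_t min_size)
{
  assert(min_size != 0);
  return cache_memory / min_size;
}

/* Hypothetical helper: buffers held by one group when it is divided for
 * the requested buffer size. Returns 0 if the size cannot be served. */
size_t sketch_buffers_per_group(
  size_t min_size,
  size_t max_size,
  size_t requested
)
{
  size_t buffer_size = min_size;

  assert(min_size != 0 && max_size >= min_size);

  if (requested == 0 || requested > max_size) {
    return 0;
  }

  /* Round the request up by doubling the minimum buffer size. */
  while (buffer_size < requested) {
    buffer_size *= 2;
  }

  return max_size / buffer_size;
}
```

With the defaults listed on this page (64 * 512 bytes of cache memory, 512 byte minimum and 4096 byte maximum buffers), this model gives 8 groups and 64 descriptors. A group holds 8 buffers of 512 bytes, 4 buffers of 1024 bytes, or a single 4096 byte buffer.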
The buffers are held in various lists in the cache. All buffers follow this state machine:
Empty or cached buffers are added to the LRU list and removed from this list when a caller requests a buffer. This is referred to as getting a buffer in the code and the event get in the state diagram. The buffer is assigned to a block and inserted into the AVL tree based on the block/device key. If the block is to be read by the user and is not in the cache, it is transferred from the disk into memory. If no buffers are on the LRU list, the modified list is checked. If buffers are on the modified list, the swap-out task will be woken. The request blocks until a buffer is available for recycling.
A block being accessed is given to the file system layer and is not accessible to another requester until released back to the cache. The same goes for a buffer in the transfer state. The transfer state means being read or written. If the file system has modified the block and releases it as modified, it is placed on the cache's modified list and a hold timer is initialised. The buffer is held for the hold time before being written to disk. Buffers are held for a configurable period of time on the modified list because a write sets the state to transfer, and this locks the buffer out from the file system until the write completes. Buffers are often accessed and modified in a series of small updates, so if a buffer were sent to the disk as soon as it was released as modified, the user would have to block waiting until it had been written. This would be a performance problem.
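The hold timer behaviour described above can be modelled in a few lines. This is only a sketch under assumptions: the structure and function names are hypothetical, and the idea that the timer is aged once per swap period is inferred from the swap period and block hold defaults listed on this page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a buffer waiting on the modified list. */
typedef struct {
  uint32_t hold_timer_ms; /* remaining hold time in milliseconds */
} sketch_modified_buffer;

/* Age the buffer by one swap period. Returns true once the hold time has
 * elapsed, that is, when the buffer would be handed over for writing. */
bool sketch_swap_period_elapsed(
  sketch_modified_buffer *bd,
  uint32_t swap_period_ms
)
{
  assert(bd != NULL);

  if (bd->hold_timer_ms > swap_period_ms) {
    bd->hold_timer_ms -= swap_period_ms;
    return false; /* still held, the file system can keep updating it */
  }

  bd->hold_timer_ms = 0;
  return true;
}
```

In this model, a buffer with the default hold time of 1000 ms and the default swap period of 250 ms becomes eligible for writing on the fourth swap period after it was released as modified.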
The code performs multiple block reads and writes. Multiple block reads, or read-ahead, increase performance with hardware that supports it. It also helps with a large cache as the disk head movement is reduced. It is, however, a speculative operation, so excessive use can remove valuable and needed blocks from the cache. The read-ahead is triggered after two misses of ascending consecutive blocks or a read hit of a block read by the most recent read-ahead transfer. The read-ahead works per disk, but all transfers are issued by the read-ahead task.
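The two trigger conditions can be expressed as a small per-disk state tracker. The sketch below follows the wording of the paragraph above and is not the cache's actual bookkeeping; the `sketch_read_ahead` structure and both functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-disk read-ahead state. */
typedef struct {
  bool     have_miss; /* a read miss has been seen on this disk */
  uint32_t last_miss; /* block number of the most recent read miss */
  uint32_t ra_first;  /* first block of the most recent read-ahead */
  uint32_t ra_count;  /* number of blocks in that read-ahead, 0 if none */
} sketch_read_ahead;

/* Record a read of `block` and report whether it would trigger a
 * read-ahead according to the two rules in the text. */
bool sketch_read_triggers_read_ahead(
  sketch_read_ahead *ra,
  uint32_t block,
  bool cache_hit
)
{
  bool consecutive;

  assert(ra != NULL);

  if (cache_hit) {
    /* Rule 2: hit on a block fetched by the most recent read-ahead. */
    return ra->ra_count > 0
      && block >= ra->ra_first
      && block - ra->ra_first < ra->ra_count;
  }

  /* Rule 1: second miss of two ascending consecutive blocks. */
  consecutive = ra->have_miss && block == ra->last_miss + 1;
  ra->have_miss = true;
  ra->last_miss = block;
  return consecutive;
}

/* Remember the range covered by a read-ahead transfer that was issued. */
void sketch_note_read_ahead(
  sketch_read_ahead *ra,
  uint32_t first,
  uint32_t count
)
{
  assert(ra != NULL);
  ra->ra_first = first;
  ra->ra_count = count;
}
```

In this model, misses on blocks 10 and 11 trigger a read-ahead on the second miss. If that read-ahead fetched blocks 12 to 15, a later hit on block 13 triggers the next one, while a hit on block 20 does not.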
The cache has the following lists of buffers:
A cache look-up will be performed to find a suitable buffer. A suitable buffer is one that matches the same allocation size as the device the buffer is for. If a buffer's group has no buffers in use with the file system or driver, the group is reallocated. This means the buffers in the group are invalidated, resized and placed on the LRU queue. There is a performance issue with this design. The reallocation of a group may force recently accessed buffers out of the cache when they should not be. The design should be changed to have groups on an LRU list if they have no buffers in use.
#define RTEMS_BDBUF_BUFFER_MAX_SIZE_DEFAULT (4096) |
Default maximum size of buffers.
#define RTEMS_BDBUF_BUFFER_MIN_SIZE_DEFAULT (512) |
Default minimum size of buffers.
#define RTEMS_BDBUF_CACHE_MEMORY_SIZE_DEFAULT (64 * 512) |
Default size of memory allocated to the cache.
#define RTEMS_BDBUF_MAX_READ_AHEAD_BLOCKS_DEFAULT 0 |
The default value for the maximum read-ahead blocks disables the read-ahead feature.
#define RTEMS_BDBUF_MAX_WRITE_BLOCKS_DEFAULT 16 |
Default maximum number of blocks to write at once.
#define RTEMS_BDBUF_READ_AHEAD_TASK_PRIORITY_DEFAULT RTEMS_BDBUF_SWAPOUT_TASK_PRIORITY_DEFAULT |
Default read-ahead task priority. The same as the swap-out task.
#define RTEMS_BDBUF_SWAPOUT_TASK_BLOCK_HOLD_DEFAULT 1000 |
Default swap-out task block hold time in milliseconds.
#define RTEMS_BDBUF_SWAPOUT_TASK_PRIORITY_DEFAULT 15 |
Default swap-out task priority.
#define RTEMS_BDBUF_SWAPOUT_TASK_SWAP_PERIOD_DEFAULT 250 |
Default swap-out task swap period in milliseconds.
#define RTEMS_BDBUF_SWAPOUT_WORKER_TASK_PRIORITY_DEFAULT RTEMS_BDBUF_SWAPOUT_TASK_PRIORITY_DEFAULT |
Default swap-out worker task priority. The same as the swap-out task.
#define RTEMS_BDBUF_SWAPOUT_WORKER_TASKS_DEFAULT 0 |
Default swap-out worker tasks. Currently disabled.
#define RTEMS_BDBUF_TASK_STACK_SIZE_DEFAULT RTEMS_MINIMUM_STACK_SIZE |
Default task stack size for swap-out and worker tasks.
typedef struct rtems_bdbuf_buffer rtems_bdbuf_buffer |
To manage buffers we use buffer descriptors (BD). A BD holds a buffer plus a range of other information related to managing the buffer in the cache. To speed up buffer lookup, descriptors are organized in an AVL tree. The fields 'dd' and 'block' are search keys.
typedef struct rtems_bdbuf_config rtems_bdbuf_config |
Buffering configuration definition. See confdefs.h for support on using this structure.
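A typical application does not fill in this structure by hand but lets confdefs.h generate it from configuration options. The fragment below is a hedged sketch: it assumes the `CONFIGURE_*` option names as documented for the block device cache in the RTEMS configuration documentation, and the values are illustrative only. Check the names against the release in use.

```c
/* Ask confdefs.h to provide rtems_bdbuf_configuration (assumed option). */
#define CONFIGURE_APPLICATION_NEEDS_LIBBLOCK

/* Override some of the defaults listed above; illustrative values. */
#define CONFIGURE_BDBUF_BUFFER_MIN_SIZE 512
#define CONFIGURE_BDBUF_BUFFER_MAX_SIZE 4096
#define CONFIGURE_BDBUF_CACHE_MEMORY_SIZE (128 * 1024)
#define CONFIGURE_BDBUF_MAX_READ_AHEAD_BLOCKS 4
#define CONFIGURE_SWAPOUT_SWAP_PERIOD 250  /* milliseconds */
#define CONFIGURE_SWAPOUT_BLOCK_HOLD 1000  /* milliseconds */

/* rtems_bdbuf_init() reports RTEMS_INVALID_NUMBER if the maximum buffer
 * size is not an integral multiple of the minimum, so check it early. */
_Static_assert(
  CONFIGURE_BDBUF_BUFFER_MAX_SIZE % CONFIGURE_BDBUF_BUFFER_MIN_SIZE == 0,
  "buffer maximum must be an integral multiple of the buffer minimum"
);

/* In a real application the remaining CONFIGURE_* options, CONFIGURE_INIT
 * and the line "#include <rtems/confdefs.h>" would follow here. */
```

Options that are left undefined fall back to the `RTEMS_BDBUF_*_DEFAULT` values documented above.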
State of a buffer of the cache.
The state has several implications. Depending on the state a buffer can be in the AVL tree, in a list, in use by an entity and a group user or not.
State | Valid Data | AVL Tree | LRU List | Modified List | Synchronization List | Group User | External User |
---|---|---|---|---|---|---|---|
FREE | | | X | | | | |
EMPTY | | X | | | | | |
CACHED | X | X | X | | | | |
ACCESS CACHED | X | X | | | | X | X |
ACCESS MODIFIED | X | X | | | | X | X |
ACCESS EMPTY | | X | | | | X | X |
ACCESS PURGED | | X | | | | X | X |
MODIFIED | X | X | | X | | X | |
SYNC | X | X | | | X | X | |
TRANSFER | X | X | | | | X | X |
TRANSFER PURGED | | X | | | | X | X |
rtems_status_code rtems_bdbuf_get(rtems_disk_device *dd, rtems_blkdev_bnum block, rtems_bdbuf_buffer **bd)
Get block buffer for data to be written into. The buffer is set to the access or modified access state. If the buffer is in the cache and modified the state is access modified, else the state is access. The buffer contents are not initialised if the buffer is not already in the cache. If the block is already resident in memory it is returned; however, if it is not in memory the buffer is not read from disk. This call is used when writing the whole block on a disk rather than just changing a part of it. If there are no buffers available this call will block. A buffer obtained with this call will not be involved in a transfer request and will not be returned to another user until released. If the buffer is already with a user when this call is made the call is blocked until the buffer is returned. The highest priority waiter will obtain the buffer first.
The block number is the linear block number. This is relative to the start of the partition on the media.
Before you can use this function, the rtems_bdbuf_init() routine must be called at least once to initialize the cache, otherwise a fatal error will occur.
dd | [in] The disk device. |
block | [in] Linear media block number. |
bd | [out] Reference to the buffer descriptor pointer. |
RTEMS_SUCCESSFUL | Successful operation. |
RTEMS_INVALID_ID | Invalid block number. |
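The state a caller ends up with can be summarised as a small mapping. The sketch below uses a local, hypothetical mirror of a few rtems_bdbuf_buf_state values rather than the real enumeration. It also covers rtems_bdbuf_read(), documented further down, for comparison. Two points are assumptions drawn from the state names rather than from the text: that a block not yet in the cache is returned by a get in the ACCESS EMPTY state, and that a read returns it as ACCESS CACHED once the data has been read successfully.

```c
#include <assert.h>

/* Local mirror of a few buffer states, for illustration only. */
typedef enum {
  SKETCH_STATE_EMPTY,
  SKETCH_STATE_CACHED,
  SKETCH_STATE_MODIFIED,
  SKETCH_STATE_ACCESS_CACHED,
  SKETCH_STATE_ACCESS_MODIFIED,
  SKETCH_STATE_ACCESS_EMPTY
} sketch_state;

/* State of the buffer handed to the caller of a get request. */
sketch_state sketch_state_after_get(sketch_state before)
{
  switch (before) {
    case SKETCH_STATE_MODIFIED:
      return SKETCH_STATE_ACCESS_MODIFIED;
    case SKETCH_STATE_CACHED:
      return SKETCH_STATE_ACCESS_CACHED;
    case SKETCH_STATE_EMPTY:
      /* Not in the cache: contents are uninitialised, nothing is read. */
      return SKETCH_STATE_ACCESS_EMPTY;
    default:
      /* Already with a user or in transfer: a real caller would block. */
      assert(0 && "buffer is not available");
      return before;
  }
}

/* State of the buffer handed to the caller of a successful read request. */
sketch_state sketch_state_after_read(sketch_state before)
{
  if (before == SKETCH_STATE_EMPTY) {
    /* Not in the cache: the data is read from disk first. */
    return SKETCH_STATE_ACCESS_CACHED;
  }

  return sketch_state_after_get(before);
}
```

The only difference between the two calls in this model is the handling of a block that is not in the cache, which is why a get is suited to overwriting a whole block and a read to changing part of one.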
rtems_status_code rtems_bdbuf_init(void)
Prepare the buffering layer to work: initialize the buffer descriptors and (if necessary) the buffers. After initialization all blocks are placed into the ready state.
RTEMS_SUCCESSFUL | Successful operation. |
RTEMS_CALLED_FROM_ISR | Called from an interrupt context. |
RTEMS_INVALID_NUMBER | The buffer maximum is not an integral multiple of the buffer minimum. The maximum read-ahead blocks count is too large. |
RTEMS_RESOURCE_IN_USE | Already initialized. |
RTEMS_UNSATISFIED | Not enough resources. |
void rtems_bdbuf_peek(rtems_disk_device *dd, rtems_blkdev_bnum block, uint32_t nr_blocks)
Give a hint which blocks should be cached next.
Provide a hint to the read-ahead mechanism which blocks should be cached next. This overrides the default linear pattern. You should use it in (for example) a file system to tell bdbuf where the next part of a fragmented file is. If you know the length of the file, you can provide that too.
Before you can use this function, the rtems_bdbuf_init() routine must be called at least once to initialize everything. Otherwise you might get unexpected results.
dd | [in] The disk device. |
block | [in] Linear media block number. |
nr_blocks | [in] Number of consecutive blocks that can be pre-fetched. |
void rtems_bdbuf_purge_dev(rtems_disk_device *dd)
Purges all buffers corresponding to the disk device dd.
This may result in loss of data. The read-ahead state of this device is reset.
Before you can use this function, the rtems_bdbuf_init() routine must be called at least once to initialize the cache, otherwise a fatal error will occur.
dd | [in] The disk device. |
rtems_status_code rtems_bdbuf_read(rtems_disk_device *dd, rtems_blkdev_bnum block, rtems_bdbuf_buffer **bd)
Get the block buffer and, if it is not already in the cache, read it from the disk. If the specified block is already cached it is returned. The buffer is set to the access or modified access state. If the buffer is in the cache and modified the state is access modified, else the state is access. If the block is already being read from disk or being written to disk this call blocks. If the buffer is waiting to be written it is removed from the modified queue and returned to the user. If the buffer is not in the cache a new buffer is obtained and the data read from disk. The call may block until these operations complete. A buffer obtained with this call will not be involved in a transfer request and will not be returned to another user until released. If the buffer is already with a user when this call is made the call is blocked until the buffer is returned. The highest priority waiter will obtain the buffer first.
Before you can use this function, the rtems_bdbuf_init() routine must be called at least once to initialize the cache, otherwise a fatal error will occur.
dd | [in] The disk device. |
block | [in] Linear media block number. |
bd | [out] Reference to the buffer descriptor pointer. |
RTEMS_SUCCESSFUL | Successful operation. |
RTEMS_INVALID_ID | Invalid block number. |
RTEMS_IO_ERROR | IO error. |
rtems_status_code rtems_bdbuf_release(rtems_bdbuf_buffer *bd)
Release the buffer obtained by a read call back to the cache. If the buffer was obtained by a get call and was not already in the cache the release modified call should be used. A buffer released with this call obtained by a get call may not be in sync with the contents on disk. If the buffer was in the cache and modified before this call it will be returned to the modified queue. The buffer is returned to the end of the LRU list.
Before you can use this function, the rtems_bdbuf_init() routine must be called at least once to initialize the cache, otherwise a fatal error will occur.
bd | [in] Reference to the buffer descriptor. The buffer descriptor reference must not be NULL and must be obtained via rtems_bdbuf_get() or rtems_bdbuf_read(). |
RTEMS_SUCCESSFUL | Successful operation. |
RTEMS_INVALID_ADDRESS | The reference is NULL. |
rtems_status_code rtems_bdbuf_release_modified(rtems_bdbuf_buffer *bd)
Release the buffer allocated with a get or read call, placing it on the modified list. If the buffer was not released modified before, the hold timer is set to the configuration value. If the buffer had been released modified before but not written to disk, the hold timer is not updated. The buffer will be written to disk when the hold timer has expired, when there are no more buffers available in the cache and a get or read call needs one, or when a sync call has been made. If the buffer is obtained with a get or read before the hold timer has expired the buffer will be returned to the user.
Before you can use this function, the rtems_bdbuf_init() routine must be called at least once to initialize the cache, otherwise a fatal error will occur.
bd | [in] Reference to the buffer descriptor. The buffer descriptor reference must not be NULL and must be obtained via rtems_bdbuf_get() or rtems_bdbuf_read(). |
RTEMS_SUCCESSFUL | Successful operation. |
RTEMS_INVALID_ADDRESS | The reference is NULL. |
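The hold timer rule for repeated releases can be sketched as below. The `sketch_buffer` structure and the function are hypothetical and only model the rule stated above: the timer is armed the first time a buffer is released as modified and is left alone on later releases until the buffer has been written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the fields relevant to a modified release. */
typedef struct {
  bool     modified;      /* released as modified and not yet written */
  uint32_t hold_timer_ms; /* remaining hold time in milliseconds */
} sketch_buffer;

/* Model of releasing a buffer as modified. */
void sketch_release_modified(sketch_buffer *bd, uint32_t block_hold_ms)
{
  assert(bd != NULL);

  if (!bd->modified) {
    /* First modified release: arm the hold timer from the configuration. */
    bd->hold_timer_ms = block_hold_ms;
    bd->modified = true;
  }

  /* Otherwise the buffer was already waiting to be written and the
   * remaining hold time is kept, so repeated updates cannot postpone the
   * write indefinitely. */
}
```

For example, a buffer released as modified with a 1000 ms hold time, obtained again after 600 ms and released as modified a second time, still has about 400 ms left before it becomes eligible for writing.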
rtems_status_code rtems_bdbuf_set_block_size(rtems_disk_device *dd, uint32_t block_size, bool sync)
Sets the block size of a disk device.
This will set the block size derived fields of the disk device. If requested the disk device is synchronized before the block size change occurs. Since the cache is unlocked during the synchronization operation some tasks may access the disk device in the meantime. This may result in loss of data. After the synchronization the disk device is purged to ensure a consistent cache state and the block size change occurs. This also resets the read-ahead state of this disk device. Due to the purge operation this may result in loss of data.
Before you can use this function, the rtems_bdbuf_init() routine must be called at least once to initialize the cache, otherwise a fatal error will occur.
dd | [in, out] The disk device. |
block_size | [in] The new block size in bytes. |
sync | [in] If true , then synchronize the disk device before the block size change. |
RTEMS_SUCCESSFUL | Successful operation. |
RTEMS_INVALID_NUMBER | Invalid block size. |
rtems_status_code rtems_bdbuf_sync(rtems_bdbuf_buffer *bd)
Release the buffer as modified and wait until it has been synchronized with the disk by writing it. This buffer will be the first to be transferred to disk and other buffers may also be written if the maximum number of blocks in a request allows it.
Before you can use this function, the rtems_bdbuf_init() routine must be called at least once to initialize the cache, otherwise a fatal error will occur.
bd | [in] Reference to the buffer descriptor. The buffer descriptor reference must not be NULL and must be obtained via rtems_bdbuf_get() or rtems_bdbuf_read(). |
RTEMS_SUCCESSFUL | Successful operation. |
RTEMS_INVALID_ADDRESS | The reference is NULL. |
rtems_status_code rtems_bdbuf_syncdev(rtems_disk_device *dd)
Synchronize all modified buffers for this device with the disk and wait until the transfers have completed. The sync mutex for the cache is locked stopping the addition of any further modified buffers. It is only the currently modified buffers that are written.
Before you can use this function, the rtems_bdbuf_init() routine must be called at least once to initialize the cache, otherwise a fatal error will occur.
dd | [in] The disk device. |
RTEMS_SUCCESSFUL | Successful operation. |
extern const rtems_bdbuf_config rtems_bdbuf_configuration
External reference to the configuration.
The configuration is provided by the application.