Volume Manager:
In computer storage, logical volume management or LVM provides a
method of allocating space on mass-storage devices
that is more flexible than conventional partitioning schemes. In particular, a volume manager can
concatenate, stripe together or otherwise combine partitions into
larger virtual ones that administrators can re-size or move, potentially
without interrupting system use.
The Volume Manager builds virtual
devices called volumes on top of physical
disks. Volumes are accessed by a UNIX file system,
a database, or other applications in the same way physical disk partitions
would be accessed. Volumes are composed of other virtual objects that can be
manipulated to change the volume's configuration. Volumes and their virtual
components are referred to as Volume Manager objects. Volume Manager
objects can be manipulated in a variety of ways to optimize performance,
provide redundancy of data, and perform backups or other administrative tasks
on one or more physical disks without interrupting applications. As a result,
data availability and disk subsystem throughput are improved.
Large file systems require the capacity
of several disks, but most file systems must be created on a single device. A
hardware RAID device is one solution to this problem. A hardware RAID device
appears as a single device while in fact containing several disk drives
internally. There are other excellent benefits of hardware RAID, but it is an
expensive solution if one simply needs to make many small disks look like a
single big disk. Volume managers are the software solution to this problem. A
volume manager is typically a mid-level block device driver (often called a
volume driver) which makes many disks appear as a single logical disk. In
addition to existing in the kernel's block I/O path, a volume manager requires
user level programs to configure and manage partitions and volumes. The virtualized
storage perspective produced by volume managers is so useful that often all
storage, including hardware RAID, is controlled with a volume manager.
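To make the idea of "many disks appearing as a single logical disk" concrete, here is a minimal sketch of the address translation a volume driver performs for a concatenated and for a striped volume. The function names, the block-based sizes and the stripe unit are assumptions made for illustration; they are not taken from any particular volume manager.

```python
from bisect import bisect_right
from itertools import accumulate


def concat_map(disk_sizes, logical_block):
    """Map a logical block to (disk_index, block_on_disk) in a concatenated volume.

    disk_sizes is a list of disk capacities in blocks (hypothetical values).
    """
    ends = list(accumulate(disk_sizes))  # cumulative end block of each disk
    if not ends or not 0 <= logical_block < ends[-1]:
        raise ValueError("logical block is outside the volume")
    disk = bisect_right(ends, logical_block)  # first disk whose end is past the block
    start = ends[disk] - disk_sizes[disk]     # first logical block held by that disk
    return disk, logical_block - start


def stripe_map(num_disks, stripe_unit, logical_block):
    """Map a logical block to (disk_index, block_on_disk) in a striped volume.

    stripe_unit is the number of consecutive blocks placed on one disk
    before moving on to the next disk.
    """
    stripe_no, offset = divmod(logical_block, stripe_unit)
    row, disk = divmod(stripe_no, num_disks)
    return disk, row * stripe_unit + offset
```

With three disks of 100, 50 and 200 blocks, concat_map([100, 50, 200], 120) lands on the second disk, while stripe_map(3, 4, 13) wraps around to the first disk again.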
A physical disk is the
underlying storage device (media), which may or may not be under Volume Manager
control. A physical disk can be accessed using a device name such
as c#b#t#d#, where c# is the controller, b# is the
bus, t# is the target ID, and d# is the disk number.
A physical disk can be divided into one
or more partitions. The partition number, or s#, is given at the
end of the device name.
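The naming scheme above can be sketched as a small parser. This assumes the c#b#t#d# layout exactly as described, with an optional trailing s# for the partition; the function name and the returned field names are hypothetical.

```python
import re

# c = controller, b = bus, t = target ID, d = disk, optional s = partition
DEVICE_NAME = re.compile(r"^c(\d+)b(\d+)t(\d+)d(\d+)(?:s(\d+))?$")


def parse_device_name(name):
    """Split a device name such as 'c0b0t1d2s3' into its numbered parts."""
    match = DEVICE_NAME.match(name)
    if match is None:
        raise ValueError(f"not a c#b#t#d#[s#] device name: {name!r}")
    fields = ("controller", "bus", "target", "disk", "partition")
    # The partition group is None when the name has no s# suffix, so skip it.
    return {field: int(value)
            for field, value in zip(fields, match.groups())
            if value is not None}
```

For example, parse_device_name("c0b0t1d2s3") yields controller 0, bus 0, target 1, disk 2 and partition 3, and the same name without "s3" simply omits the partition entry.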
Storage Area Network and SAN Protocols:
A Storage Area Network (SAN) is a
high-speed network or subnetwork whose primary purpose is to transfer data
between computer and storage systems. A storage device is a machine that
contains nothing but a disk or disks for storing data. A SAN consists of a
communication infrastructure, which provides physical connections; and a
management layer, which organizes the connections, storage elements, and
computer systems so that data transfer is secure and robust.
Typically, a storage area network is
part of the overall network of computing resources for an enterprise. A storage
area network is usually clustered in close proximity to other computing
resources but may also extend to remote locations for backup and archival
storage. SANs support disk mirroring, backup and restore, archival and
retrieval of archived data, data migration from one storage device to another,
and the sharing of data among different servers in a network. SANs can
incorporate subnetworks with network-attached storage (NAS) systems.
There are a few SAN technologies
available in today's implementations, such as IBM's optical fiber ESCON, which
is enhanced by the FICON architecture, and the newer Fibre Channel technology.
High-speed Ethernet is also used for connections in a Storage Area Network. SCSI
and iSCSI are popular technologies used in Storage Area Networks.
A SAN's architecture makes all
storage devices available to all servers on a LAN or WAN. As more storage devices are added to a SAN, they too
will be accessible from any server in the larger network. A Storage Area
Network can be anything from two servers on a network accessing a central pool
of storage devices to several thousand servers accessing many millions of
megabytes of storage.
iSCSI: Internet Small Computer System Interface:
Internet Small Computer System Interface
(iSCSI) is a TCP/IP-based protocol for establishing and managing connections
between IP-based storage devices, hosts and clients, which together form a
Storage Area Network (SAN). The SAN makes it possible to use the SCSI protocol
in network infrastructures for high-speed data transfer at the block level
between multiple elements of data storage networks.
The architecture of SCSI is based on the client/server model, which is
mostly implemented in an environment where devices are very close to each other
and connected with SCSI buses. The main function of iSCSI is the encapsulation
and reliable delivery of bulk data transactions between initiators and targets
through the TCP/IP network. iSCSI provides a mechanism for encapsulating SCSI
commands on an IP network and operates on top of TCP.
For today's SANs (Storage Area Networks),
the key requirements of data communication are: 1) consolidation of data
storage systems, 2) data backup, 3) server clustering, 4) replication, and 5)
data recovery in emergency conditions. In addition, a SAN is likely to be
geographically distributed over multiple LANs and WANs that use various
technologies. All operations must be conducted in a secure environment and with
QoS. iSCSI is designed to perform the above functions in the TCP/IP network
safely and with proper QoS.
iSCSI has four components:
· iSCSI Address and Naming Conventions: An iSCSI node is an identifier of SCSI
devices (in a network entity) available through the network. Each iSCSI node
has a unique iSCSI name (up to 223 bytes) which is formed according to the
rules adopted for Internet nodes.
· iSCSI Session Management: The iSCSI session consists of a Login Phase and a
Full Feature Phase which is completed with a special command.
· iSCSI Error Handling: Because of a high probability of errors in data
delivery in some IP networks, especially WANs, where iSCSI can work, the
protocol provides a great many measures for handling errors.
· iSCSI Security: As iSCSI can be used in networks where data can be accessed
illegally, the protocol allows different security methods.
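As an illustration of the naming conventions, the widely used "iqn" name form combines the prefix "iqn.", a year-month date, a reversed Internet domain name and an optional colon-separated suffix. The sketch below only checks that general shape; the pattern is a simplification and the function name parse_iqn is hypothetical.

```python
import re

# Simplified shape of an iqn-style name: iqn.YYYY-MM.reversed.domain[:suffix]
IQN_PATTERN = re.compile(
    r"^iqn\.(?P<year>\d{4})-(?P<month>\d{2})"
    r"\.(?P<authority>[a-z0-9](?:[a-z0-9.-]*[a-z0-9])?)"
    r"(?::(?P<suffix>.+))?$"
)


def parse_iqn(name):
    """Return the parts of an iqn-style name, or None if it does not fit the shape."""
    match = IQN_PATTERN.match(name)
    if match is None:
        return None
    if not 1 <= int(match["month"]) <= 12:
        return None  # the date part must contain a real month
    return match.groupdict()
```

A name such as iqn.2001-04.com.example:storage.disk1 (an example domain, not a real target) splits into the authority com.example and the suffix storage.disk1.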
By carrying SCSI commands over IP
networks, iSCSI is used to facilitate data transfers over intranets and to
manage storage over long distances. iSCSI can be used to transmit data
over local area networks (LANs), wide area networks (WANs), or the Internet and
can enable location-independent data storage and retrieval.
The protocol allows clients (called initiators) to send SCSI commands (CDBs) to
SCSI storage devices (targets) on remote servers. In the relationship between
your computer and the storage device, your computer is called an initiator
because it initiates the connection to the device, which is called a target.
iSCSI is a storage area network (SAN) protocol, allowing
organizations to consolidate storage into data center storage arrays while
providing hosts (such as database and web servers) with the illusion of locally
attached disks. Unlike traditional Fibre
Channel, which requires special-purpose cabling, iSCSI can be run over long
distances using existing network infrastructure.
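To give a feel for what a SCSI command (CDB) looks like before iSCSI wraps it for transport over TCP, here is a minimal sketch that builds a 10-byte READ(10) command descriptor block. It covers only the CDB itself; the iSCSI packet that would carry it is not modelled, and the function name is hypothetical.

```python
import struct

READ_10_OPCODE = 0x28  # SCSI READ(10) operation code


def build_read10_cdb(lba, num_blocks):
    """Build a READ(10) CDB asking for num_blocks blocks starting at lba."""
    if not 0 <= lba <= 0xFFFFFFFF:
        raise ValueError("READ(10) carries a 32-bit logical block address")
    if not 0 <= num_blocks <= 0xFFFF:
        raise ValueError("READ(10) carries a 16-bit transfer length")
    # Layout (big-endian): opcode, flags, 4-byte LBA, group number,
    # 2-byte transfer length, control byte.
    return struct.pack(">BBIBHB", READ_10_OPCODE, 0, lba, 0, num_blocks, 0)
```

An initiator would hand a CDB like build_read10_cdb(2048, 8) to its iSCSI layer, which encapsulates it and delivers it to the target over the TCP connection.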
Oracle Clusterware:
Oracle Clusterware is
software that enables servers to operate together as if they are one server.
Each server looks like any standalone server. However, each server has
additional processes that communicate with each other so the separate servers
appear as if they are one server to applications and end users. In addition,
Oracle Clusterware enables the protection of any Oracle application or any
other kind of application within a cluster.
A cluster is a group of independent
servers used in a network that cooperate as a single system. Clustering is a
technique used to create a highly available and easily scalable environment.
Cluster software is the software running on each of these servers that provides
the intelligence, which enables the coordinated cooperation of those servers.
If one of the cluster servers fails, the work previously running on that server
can be restarted on another available server in the cluster.
Clusterware monitors all components, such as
instances and listeners. There are two important components in Oracle
Clusterware: the voting disk and the OCR (Oracle Cluster Registry).
The voting disk and the OCR are created on shared storage during the Oracle
Clusterware installation process.
OCR File :- Cluster configuration
information is maintained in the Oracle Cluster Registry file. OCR relies on a
distributed shared-cache architecture for optimizing queries against the
cluster repository. Each node in the cluster maintains an in-memory copy of
OCR, along with an OCR process that accesses its OCR cache.
When OCR client applications need to
update the OCR, they communicate through their local OCR process to the OCR
process that is performing input/output (I/O) for writing to the repository on
disk.
The OCR client applications are Oracle
Universal Installer (OUI), SRVCTL, Enterprise Manager (EM), Database
Configuration Assistant (DBCA), Database Upgrade Assistant (DBUA), NetCA and
Virtual Internet Protocol Configuration Assistant (VIPCA). OCR also maintains
dependency and status information for application resources defined within CRS,
specifically databases, instances, services and node applications.
Note:- The name of the
configuration file is ocr.loc and the configuration file variable is
ocrconfig_loc.
Oracle Cluster Registry (OCR) :- The OCR resides on
shared storage and maintains information about the cluster configuration and
the cluster database. The OCR contains information such as which
database instances run on which nodes and which services run on which database.
The OCR also manages information about processes that Oracle Clusterware
controls. The OCR stores configuration information in a series of key-value
pairs within a directory tree structure. The OCR must reside on a shared disk that
is accessible by all of the nodes in your cluster. Oracle Clusterware can
multiplex the OCR and Oracle recommends that you use this feature to ensure
cluster high availability.
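The "key-value pairs within a directory tree" idea can be pictured with a small nested-dictionary sketch. This is only a conceptual model: the dotted key names, the values and the helper functions below are invented for illustration and do not reflect the real OCR key layout or on-disk format.

```python
def set_key(tree, path, value):
    """Store value under a dotted path, creating intermediate 'directories'."""
    *parents, leaf = path.split(".")
    node = tree
    for part in parents:
        node = node.setdefault(part, {})  # descend, creating the level if missing
    node[leaf] = value


def get_key(tree, path):
    """Look up the value stored under a dotted path."""
    node = tree
    for part in path.split("."):
        node = node[part]
    return node


# Hypothetical registry content: which instance runs on which node.
registry = {}
set_key(registry, "DATABASE.orcl.INSTANCE.orcl1.NODE", "rac1")
set_key(registry, "DATABASE.orcl.INSTANCE.orcl2.NODE", "rac2")
```

Reading get_key(registry, "DATABASE.orcl.INSTANCE.orcl2.NODE") walks the tree level by level, which is the same kind of lookup the text describes for which instances run on which nodes.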
Note:- You can replace a failed OCR
online, and you can update the OCR through supported APIs such as Enterprise
Manager, the Server Control Utility (SRVCTL), or the Database Configuration
Assistant (DBCA).
Voting Disk :- Manages cluster
membership by way of a health check and arbitrates cluster ownership among the
instances in case of network failures. RAC uses the voting disk to determine
which instances are members of a cluster. The voting disk must reside on shared
disk. For high availability, Oracle recommends that you have multiple voting
disks. The Oracle Clusterware enables multiple voting disks.
There isn’t really any useful data kept
in the voting disk. So, if you lose voting disks, you can simply add them back
without losing any data. But, of course, losing voting disks can lead to node
reboots. If you lose all voting disks, then you will have to keep the CRS
daemons down; only then can you add the voting disks back.
Cache Fusion:
Cache Fusion is a diskless cache coherency
mechanism in Oracle RAC that provides copies of data
blocks directly from one instance’s memory cache (in which the block is
available) to another instance (the instance that is requesting the specific
data block). Cache Fusion provides a single buffer cache (for all instances in
the cluster) through the interconnect.
In a single-node Oracle
database, an instance looking for a data block first checks the cache; if the
block is not in the cache, it goes to disk, pulls the block from disk into the
cache, and returns the block to the client.
In a RAC database there
is also a remote cache, so an instance should look not only in the local cache
(the cache local to the instance) but also in the remote cache (the cache on a
remote instance). If the block is available in the local cache, the instance
should return the data block from the local cache; if the data block is not in
the local cache, then instead of going to disk it should first go to the remote
cache (remote instance) to check whether the block is available there (via the
interconnect).
This is because accessing a data block
from a remote cache is faster than accessing it from disk.
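The lookup order described above (local cache, then remote cache, then disk) can be sketched as follows. Plain dictionaries stand in for the buffer caches and the disk, and the function name is hypothetical; real Cache Fusion also coordinates block ownership and locking, which this sketch leaves out.

```python
def read_block(block_id, local_cache, remote_caches, disk):
    """Return (block, source), trying the cheapest location first."""
    # 1. Local buffer cache of this instance.
    if block_id in local_cache:
        return local_cache[block_id], "local cache"
    # 2. Buffer caches of the other instances, reached over the interconnect.
    for cache in remote_caches:
        if block_id in cache:
            local_cache[block_id] = cache[block_id]  # keep a local copy
            return local_cache[block_id], "remote cache"
    # 3. Only if no instance holds the block, read it from disk.
    local_cache[block_id] = disk[block_id]
    return local_cache[block_id], "disk"
```

After a block has been fetched once from a remote cache or from disk, a second read_block call for the same block is served from the local cache.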
Heartbeat:
A heartbeat is a polling mechanism,
similar to a ping, that monitors the availability of other servers in a RAC
system. It is sent over the cluster interconnect to ensure that all RAC nodes
are available.
The heartbeat is part of the clusterware
node monitoring. When a node does not respond to a heartbeat signal, the
instance is assumed to have crashed and it is "evicted" (expelled) from the
cluster.
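A minimal sketch of this monitoring logic is shown below: every node's last heartbeat time is recorded, and a node that stays silent for longer than a timeout is evicted. The class name, the node names and the timeout value are assumptions for illustration, not Oracle Clusterware settings.

```python
class HeartbeatMonitor:
    """Track the last heartbeat of each node and evict the silent ones."""

    def __init__(self, nodes, timeout):
        self.timeout = timeout                       # seconds of silence tolerated
        self.last_seen = {node: 0.0 for node in nodes}

    def record(self, node, now):
        """Note that a heartbeat from node arrived at time now."""
        if node in self.last_seen:                   # ignore already evicted nodes
            self.last_seen[node] = now

    def evict_silent_nodes(self, now):
        """Remove and return the nodes whose last heartbeat is too old."""
        evicted = [node for node, seen in self.last_seen.items()
                   if now - seen > self.timeout]
        for node in evicted:
            del self.last_seen[node]
        return sorted(evicted)
```

With a 30-second timeout, a node that last reported at second 5 is evicted when checked at second 50, while a node that reported at second 40 remains a member.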
Public IP, Private IP, Virtual IP and DNS SERVER:
Public IP: The public IP address name must
be resolvable to the hostname. You can register both the public IP and the VIP
address with the DNS. If you do not have a DNS, then you must make sure that
both addresses are in the /etc/hosts file on all
cluster nodes.
Private IP: A private IP address for each
node serves as the private interconnect address for internode cluster
communication only. Oracle RAC requires private IP addresses to
manage the CRS, the clusterware heartbeat process and the Cache Fusion layer.
Virtual IP: A public Internet Protocol (IP)
address for each node, to be used as the Virtual IP address (VIP) for client
connections. If a node fails, then Oracle Clusterware fails over the VIP
address to an available node. This address should be in the /etc/hosts file
on any node. The VIP should not be in use at the time of the installation,
because this is an IP address that Oracle Clusterware manages. Oracle uses a
Virtual IP (VIP) for database access. The VIP must be on the same subnet
as the public IP address. The VIP is used for RAC failover (Transparent
Application Failover, TAF).
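The requirement that the VIP be on the same subnet as the public IP address can be checked with Python's standard ipaddress module. The addresses and netmask in the usage note are made-up examples, and the function name is hypothetical.

```python
import ipaddress


def same_subnet(public_ip, vip, netmask):
    """Return True if vip lies in the subnet defined by public_ip and netmask."""
    # strict=False lets us pass a host address instead of the network address.
    network = ipaddress.ip_network(f"{public_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(vip) in network
```

For a node with public address 192.168.1.10 and netmask 255.255.255.0, a VIP of 192.168.1.110 passes the check, whereas 192.168.2.110 does not.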
DNS SERVER: The Domain Name System (DNS) is a standard technology for managing
the names of Web sites and other Internet domains. DNS technology allows you to
type names such as redshoretech.com into your Web browser and have your
computer automatically find that address on the Internet. A key element of the
DNS is a worldwide collection of DNS servers.
A DNS server is any
computer registered to join the Domain Name System. A DNS server runs
special-purpose networking software, features a public IP address,
and contains a database of network names and addresses for other Internet
hosts.
LUN (LOGICAL UNIT NUMBER):
Suppose we have a large storage array
and the requirement is that no single server may use all of the storage space,
so the array needs to be divided into logical units, each identified by a LUN
(Logical Unit Number). LUNs allow us to slice a storage array into usable
storage chunks and present them to servers. A LUN basically refers to either an
entire physical volume or a subset of a larger physical disk or volume. A LUN
represents a logical abstraction, or virtual layer, between the physical disk
and the application. The LUN is a SCSI concept.
Most storage devices use the SCSI command set to communicate. In simple
words, devices that are connected via a SCSI parallel bus are
controlled with the SCSI command set.
A LUN on a SCSI parallel bus is used to electrically address the devices.
Multiple devices can appear on a single connection because of LUNs. So, for a
system admin, a LUN is a uniquely identifiable storage device.
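The slicing of an array into LUNs that are presented to specific servers can be pictured with a toy model. The class, its sizes in gigabytes and the server names are hypothetical; a real array also handles masking, zoning and RAID layout, none of which is modelled here.

```python
class StorageArray:
    """Toy storage array that carves its capacity into numbered LUNs."""

    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.luns = {}  # LUN number -> (size in GB, server it is presented to)

    def free_gb(self):
        """Capacity not yet assigned to any LUN."""
        return self.capacity_gb - sum(size for size, _ in self.luns.values())

    def create_lun(self, size_gb, server):
        """Carve a new LUN out of the free space and present it to server."""
        if size_gb <= 0 or size_gb > self.free_gb():
            raise ValueError("requested size does not fit in the free space")
        lun_id = len(self.luns)  # next unused LUN number
        self.luns[lun_id] = (size_gb, server)
        return lun_id

    def luns_for(self, server):
        """LUN numbers visible to the given server."""
        return sorted(lun for lun, (_, owner) in self.luns.items()
                      if owner == server)
```

Creating a 400 GB LUN for one server and a 300 GB LUN for another on a 1000 GB array leaves 300 GB free, and neither server can claim the whole array.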