Online Archive
Personal Storage Files (PSTs) are a problem for many Exchange users because they are not backed up or physically protected, yet they can contain confidential information or data needed for legal discovery. This information can be lost or fall into the hands of unauthorized people. Many solutions have come to market to help control PST sprawl within a company. Some of these solutions search for, gather, and index the data within the PSTs and store it in a centralized repository. These solutions can scale well and have become fairly mature products; however, they often require a separate access method or Outlook plug-ins to reach the archived data. The online archive, although it lacks some of the more sophisticated tools that third-party products offer, allows access to the archived data through Microsoft Outlook 2010 as well as through Outlook Web App (OWA) without requiring the installation of additional software on each client computer or device.
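As a brief illustration, the following Exchange Management Shell sketch enables a personal (online) archive for an existing mailbox and then confirms that the archive exists. The mailbox name is hypothetical; Enable-Mailbox with the -Archive switch and Get-Mailbox are the relevant Exchange Server 2010 cmdlets.

```powershell
# Enable a personal (online) archive for an existing mailbox.
# "Kim Akers" is a hypothetical mailbox identity.
Enable-Mailbox -Identity "Kim Akers" -Archive

# Verify that the archive was provisioned for the mailbox.
Get-Mailbox -Identity "Kim Akers" | Format-List Name,ArchiveName,ArchiveDatabase
```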
High-Availability Improvements
In recent years the market trend has been to use larger and more sophisticated SANs (storage area networks) to provide redundancy and failover solutions.
In stark contrast, the Exchange product team has been reducing the need for expensive SANs and complicated replication topologies by lowering the disk I/O requirements of the product and by building robust replication technologies directly into Exchange, where they are better able to understand the health of the application. The two factors that influence the high-availability options in Exchange Server 2010 are the reduction in disk I/O requirements, with the resulting improved performance on inexpensive, non-RAID-protected disks, and the introduction of the database availability group (DAG).
Note: With the emergence of cloud computing, the trend has been toward inexpensive, highly available computing and storage. Cloud computing vendors offer large amounts of storage over the Internet for a small monthly fee. To drive down costs, many of these storage vendors rely on commodity hardware and custom-written software, rather than expensive RAID and SAN configurations, to provide redundancy. The storage often consists of low- and mid-tier, high-capacity hard drives attached to server computers without RAID controllers or SAN connectivity. RAID and SAN configurations copy data at a bit level on the disk, usually with little understanding of the files and the format of the data.
Clients then attach to the service using a common storage protocol or a proprietary API, and the storage service takes care of making multiple copies of the data, either on separate disks on different servers in the same site or in multiple sites, to provide redundancy. This approach saves cloud storage vendors money on the initial purchase because it does not require a SAN, nor does it require specialized datacenter staff to manage and maintain one. It also leaves fewer single points of failure because the data is spread across multiple disks and multiple servers; if any single failure were to occur, the likelihood that it would affect data availability is greatly reduced.
Traditional SANs present a logical unit number (LUN) with a specific RAID type to a host server. All data stored on this LUN is striped or mirrored across all of the physical disks in the underlying group, so every file stored on the LUN shares the same redundancy. Using software to provide redundancy across multiple independent storage devices gives additional flexibility because each file can be given a different protection level and copied as many times as needed for redundancy and availability.
The cloud is not the only place this trend away from hardware RAID has been seen. Windows Home Server, a consumer product aimed at home storage networking, also uses software to control data redundancy at the file level rather than relying on software or hardware RAID. This allows the product to aggregate storage across different hard drive sizes and technologies and to provide a unified file system that stores data while still providing redundancy in case of inevitable hardware failure. It is clear that this is a new trend toward file-based, RAID-less redundancy in the industry, a trend that is apparent in the design of Exchange 2010.
Exchange Server 2007 introduced a number of new options for availability that previously required special network and SAN hardware. The following availability options are available in Exchange Server 2007 SP1:
- Cluster Continuous Replication (CCR) is a two-node clustering solution that requires no shared storage and maintains two full copies of the databases to protect against server and storage failures. The data is kept in sync by replicating the transaction logs from the active to the passive node of the cluster. The passive node then applies the transaction logs to the passive copy of the database.
- Standby Copy Replication (SCR) is an availability option that does not require clustering, and can provide multiple passive database copies on multiple servers. The database copies can be set to not apply logs for up to seven days so as to provide for point-in-time recovery. Activating a passive copy requires administrative intervention. SCR was first introduced in Exchange Server 2007 SP1.
- Single Copy Cluster (SCC) is similar to traditional Exchange clustering in that it does use shared storage; however, in Exchange Server 2007 the setup process, stability, and overall management experience were greatly improved.
- Local Continuous Replication (LCR) is a single-server, data-redundancy solution that maintains two full copies of the databases to protect against local storage failures. Activation of the passive copy requires administrative intervention.
The improvements included in Exchange Server 2010 combine many of the benefits available in Exchange Server 2007 SP1, along with several other improvements, into the DAG. A DAG is a group of up to 16 Exchange Server 2010 Mailbox servers; each server can host up to 100 databases, and each database can have up to 16 copies, with no more than one copy per server in the DAG.
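As a rough sketch of how a DAG is created from the Exchange Management Shell, the commands below build a DAG and add two Mailbox servers to it. The DAG name, witness server, witness directory, and server names are all hypothetical; New-DatabaseAvailabilityGroup and Add-DatabaseAvailabilityGroupServer are the Exchange Server 2010 cmdlets used for these tasks.

```powershell
# Create the DAG; the witness server and directory shown are hypothetical placeholders.
New-DatabaseAvailabilityGroup -Name "DAG1" -WitnessServer "HUB1" -WitnessDirectory "C:\DAG1Witness"

# Add Mailbox servers (hypothetical names) as members of the DAG.
Add-DatabaseAvailabilityGroupServer -Identity "DAG1" -MailboxServer "MBX1"
Add-DatabaseAvailabilityGroupServer -Identity "DAG1" -MailboxServer "MBX2"
```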
The DAG differs from Exchange Server 2007 SP1 in the following ways:
- With CCR, there can be only two copies of the database within the cluster; within the DAG there can be up to 16 copies of each database.
- With SCR, the activation process required administrative intervention; within a DAG, failover between individual database copies can happen automatically (see the database copy sketch after this list).
- With SCC, a single copy of the database consumes less storage but provides no redundancy. There is no configuration in Exchange Server 2010 that replaces this functionality, although some third-party solutions may provide similar functionality.
- With LCR, a single-server configuration allows two copies of a database to reside on different storage connected to the same server. There is no configuration in Exchange Server 2010 that replaces this functionality.
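To illustrate the database copy model summarized in the list above, the following sketch adds copies of a database to other DAG members and then checks their health. The database and server names are hypothetical; Add-MailboxDatabaseCopy and Get-MailboxDatabaseCopyStatus are the Exchange Server 2010 cmdlets involved, and the ReplayLagTime parameter delays log replay on a copy in roughly the way a lagged SCR target did in Exchange Server 2007 SP1.

```powershell
# Add a second copy of DB01 on another DAG member (hypothetical names).
Add-MailboxDatabaseCopy -Identity "DB01" -MailboxServer "MBX2" -ActivationPreference 2

# Add a lagged copy on a third DAG member (MBX3 is assumed to be in the same DAG);
# log replay on this copy is delayed by seven days for point-in-time protection.
Add-MailboxDatabaseCopy -Identity "DB01" -MailboxServer "MBX3" -ReplayLagTime 7.00:00:00 -ActivationPreference 3

# Review the status and queue lengths for every copy of DB01.
Get-MailboxDatabaseCopyStatus -Identity "DB01" | Format-Table Name,Status,CopyQueueLength,ReplayQueueLength
```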
Note: During the development of Exchange Server 2010, a small number of people held onto legacy high-availability and data-redundancy solutions such as SCC and requested that these functions be retained in the product. In the end, the benefits did not outweigh the negatives. SCC was usually deployed where failover operations needed to be minimized or where companies already had a significant investment in SAN hardware; however, SCC was never promoted as a best practice by Microsoft. There are a number of reasons for the diminishing return of SCC:
- It does not protect against storage failures.
- It requires third-party products for multi-site failover.
- It provides protection only from hardware failures and operating system issues, and it reduces downtime for software updates.
- It requires shared SAN storage, often the most expensive disk.
- In the event of an Information Store issue, the service restarts on the same cluster node without any protection from downtime.
The fact remained that a solution retaining the most critical single point of failure, a single copy of the database, did not meet the requirements of customers. A DAG configuration removes that single point of failure and also protects against hardware and operating system issues. With the reduction in required disk I/O, high-performance SAN-based disk is no longer required; therefore, even though twice as much disk is needed to hold two copies of each database, the storage costs for an Exchange Server 2010 DAG are often still significantly lower than for a similarly configured Exchange Server 2007 SCC cluster.