What Is an Oracle Snapshot

Snapshots

A snapshot is a staged copy of data in a Data Store that is used in one or more processes.

Note that you do not have to copy the data that you are working with, but doing so allows you greater access to Director’s results browsing functionality, as you are able to drill down on processor metrics to see the data itself, at each stage in your processing.

Commonly, you might take a copy of the data when working on an audit process, or when defining the rules for data cleansing, but you might run a process in streaming mode (that is, without copying data into the repository) when you run a data cleansing process in production, in order to save time in execution. See the Performance Tuning Guide for more information.

You may define the following properties of a snapshot:

Once a snapshot configuration has been added, you can run the snapshot by right-clicking on it in the Project Browser and selecting Run Snapshot.


Alternatively, you may choose to run the snapshot when you run the first process that uses it.

Snapshot Sharing

Snapshots are shared at the project level. This means that many processes in the same project may use the same snapshot, but processes in different projects may not. If you copy and paste a Snapshot configuration into a new project, this is an independent snapshot, and you will need to run it in order to use the staged data in a process (unless you are streaming data from the data source).

Snapshot Editing/Deletion

Note that if you choose to rename a snapshot, and that snapshot is used in processes, those processes will be invalidated. They will not automatically point at the renamed snapshot. Processes refer to snapshots by name so that you can easily move configurations between servers, where internal IDs would be different.

If required, you can also delete a snapshot using a right-click menu option. Note that if the snapshot is used by other configuration objects, a warning will be displayed, as these objects may then be in error.

Note that it is normally best to snapshot all columns, and select those you wish to work with in a given process by configuring the Reader.

No Data Handling

Snapshot Types

There are two types of snapshot:

Server-side snapshots may be reloaded either manually or automatically (for example, as part of a scheduled job) whenever the server has access to the data source. This means that when a process is scheduled for execution, it can automatically rerun the snapshot and pick up any changes in the data if required.

Client-side snapshots are used for sources of data that are accessed via the client rather than the server. For example, the data you wish to work with might be stored on a client machine that does not have an OEDQ host installed (that is, the client accesses an OEDQ host on the network). In this case, the data is copied to the OEDQ host’s repository via connectors on the client.

Canceling Snapshots


Let us begin by noting that the statement in the title is, generally speaking, false. However, the existing mismatch between the performance of local networks and that of the links connecting them, aggravated by application requirements, often forces you to set up separate local database servers and, moreover, to duplicate on them part of the data “from the center”. (This is only one reason why data replication becomes the only technical solution. There are others, but first, this reason is the most important, and second, a systematic treatment of the problem is not the aim of this note.)

Initial conditions

Suppose there are two running databases. They can, for example, run on the same computer. In my classes I use the names TEACHER and TEACHER1, and these database names are used below. An external connection named MYTEACHER has been set up for the TEACHER database. The easiest way to create it is in Net8 Assistant.

We intend to create, in the SCOTT schema of the TEACHER1 database, a table that is a copy of the SCOTT.DEPT table of the TEACHER database.

The text below was “played through” on server version 8.1.6, which managed both databases. Because some details of how replication is set up differ between version 7 and version 8, the necessary caveats are given in the text.

One-way replication step by step

For convenience, you can open two console windows with SQL*Plus: the first for working with the TEACHER database, and the second for TEACHER1.

Now check the value of the init parameter job_queue_processes on your server. The quickest way is the SQL*Plus command show parameter job. If the value is 0, use a text editor to set job_queue_processes = 1 in the INIT.ORA file for TEACHER1 (the database in which the replica and its refresh job will live), save the file, and restart the system. (If everything runs on one machine, please do not mix up the databases!)

General explanation

In one-way replication in Oracle (which is what we have built), data is transferred from the master site to the snapshot site. (In principle, nothing prevents you from setting up both within a single database.)

On the master database, the CREATE SNAPSHOT LOG command is issued. This command:

On the snapshot database, the CREATE SNAPSHOT command is issued, which automatically creates:

The general replication technique thus becomes clear: in the master database, a trigger is created that records every change to the replicated table in a special log table, and in the snapshot database a built-in procedure runs periodically, contacts the master database over the link to fetch the data, and applies the necessary changes to the replica.
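The mechanism just described can be sketched as a toy model. The Python below is not Oracle code: the class and function names (`MasterTable`, `refresh_fast`) are hypothetical, the "log" is a plain list of touched primary keys, and a single replica is assumed, so the log is purged right after the refresh.

```python
# Toy model of log-based ("fast") refresh; not Oracle code.
# MasterTable and refresh_fast are hypothetical names used for illustration.

class MasterTable:
    """A master table keyed by primary key, with a change log."""

    def __init__(self, rows):
        self.rows = dict(rows)   # primary key -> row value
        self.log = []            # primary keys touched since the last refresh

    def upsert(self, key, value):
        self.rows[key] = value
        self.log.append(key)     # plays the role of the logging trigger

    def delete(self, key):
        self.rows.pop(key, None)
        self.log.append(key)


def refresh_fast(master, replica):
    """Apply only the logged changes to the replica, then purge the log."""
    changed = set(master.log)
    for key in changed:
        if key in master.rows:
            replica[key] = master.rows[key]   # row inserted or updated on the master
        else:
            replica.pop(key, None)            # row deleted on the master
    master.log.clear()
    return len(changed)


dept = MasterTable({10: "ACCOUNTING", 20: "RESEARCH", 30: "SALES"})
deptcopy = dict(dept.rows)      # initial full build of the replica

dept.upsert(40, "OPERATIONS")
dept.upsert(20, "R&D")
dept.delete(30)

touched = refresh_fast(dept, deptcopy)   # only three keys are transferred
```

The point of the sketch is that the refresh cost depends on the number of logged changes, not on the size of the table, which is why a log keyed by primary key is enough to bring the replica up to date.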

A small terminological digression

A replica, that is, a table that reproduces changes to data in other tables (and possibly in other databases), is called a snapshot in Oracle version 7. (Other renderings of the term occur in Russian, for example the literal “photographic snapshot”.) In version 8 this name is kept for compatibility, but a more general one has also appeared: materialized view. The materialized view mechanism can do much more than a snapshot, for example automatically rewrite a SQL query into another one that is processed more efficiently, where possible. (Just do not expect the technical implementation of materialized views to be transparent!) Every snapshot is a materialized view, but not every materialized view is a snapshot.

Having introduced a new concept to replace the old one, Oracle, as often happens with it, did not manage to “uproot” the old one completely, and it regularly surfaces in various places: in messages, in process names, and so on. There is no need to be alarmed; you just have to get used to it.

Comments on the replication setup

It is now worth commenting briefly on the specific actions just performed.

One-way “master site to snapshot site” replication is the simplest to set up and maintain, and it does not require the Advanced Replication Option on your system. It does require the DBMS_SNAPSHOT and DBMS_REFRESH packages, but these are normally installed when the database is created, as a result of running catproc.sql, which in turn calls dbmssnap.sql and prvtsnap.sql.

Comment on the initial conditions

The user names on the master and snapshot sites do not have to match at all. SCOTT was chosen only because everyone knows this user and it is almost always present in the database (and if it is not, it can be created with a single command in SQL*Plus).

Comment on step 1

The DEPT table was chosen for two reasons: it has a primary key (which is mandatory in our variant of creating the replica, although not in general), and it is non-empty and small, which is convenient for illustration.

The WITH PRIMARY KEY phrase indicates that references from the log to rows of the base table DEPT are made by key. They could also be made by ROWID, by specifying WITH ROWID (in version 7 this was the only option), but it is more correct and more robust (with respect to later changes in the application) to have the log reference rows by key.

Comment on step 2

The system privileges CREATE SNAPSHOT and CREATE DATABASE LINK are part of the DBA role, and the second is also part of CONNECT, IMP_FULL_DATABASE, and RECOVERY_CATALOG_OWNER. So it is quite possible that “your” SCOTT already has them.

The name of the link being created must match the name of the database with which communication is being set up.

Comment on step 3

In the CREATE SNAPSHOT statement, the BUILD IMMEDIATE phrase means that the “first” replica is built as soon as the statement is issued.

The REFRESH FAST phrase indicates that the replica is changed by applying the modifications of the source data to it. The alternative would be to change the replica by rebuilding it completely at the required interval. Clearly, the method we chose is more economical, especially on large tables.

The phrase START WITH SYSDATE NEXT SYSDATE + 1/1440 says that the SNP background process, set up in step 2 by changing the INIT parameter, will be used by the job created by the CREATE SNAPSHOT statement to fetch the changes in the base table and apply them to the replica, starting now and then every minute.
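The "every minute" reading of 1/1440 follows from Oracle date arithmetic counting in days: a day has 24 × 60 = 1440 minutes. The short Python check below only illustrates that arithmetic; the helper names and the example timestamp are assumptions made for the sketch.

```python
from datetime import datetime, timedelta
from fractions import Fraction

# Oracle date arithmetic counts in days, so SYSDATE + 1/1440 is "now plus one minute".
SECONDS_PER_DAY = 24 * 60 * 60


def interval_from_days(days):
    """Convert a fraction of a day into a timedelta."""
    seconds = Fraction(days) * SECONDS_PER_DAY
    return timedelta(seconds=float(seconds))


def next_refresh(start, days):
    """Time of the next refresh for a NEXT clause of the form SYSDATE + days."""
    return start + interval_from_days(days)


one_minute = interval_from_days(Fraction(1, 1440))
start = datetime(2000, 1, 1, 12, 0, 0)              # arbitrary example timestamp
following = next_refresh(start, Fraction(1, 1440))  # one minute after start
```

By the same arithmetic, 1/24 would mean an hourly refresh and 1 a daily one.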

The WITH PRIMARY KEY phrase: since we specified it when creating the log table, we must specify it here as well. See also the corresponding comment for step 1.

Regarding the SELECT query. Here it looks as simple as can be, but, first, it could have selected rows with a WHERE condition and/or columns by listing the fields explicitly. That is, we are by no means obliged to reproduce the whole source table on the snapshot site, which is a big plus for application development. Moreover, and second, we are not obliged to restrict this query to a single base table, and can retrieve data from two or more tables. When planning such a solution, though, it is worth taking efficiency into account as well, and perhaps considering a different way of implementing it.

At this step it is worth querying USER_OBJECTS to see what new objects have appeared in your schema (of those that Oracle8 sees fit to show you). Note that DEPTCOPY is a view, and you will not be able to modify it yourself. It is a read-only derived table.

Comment on step 4

Do not forget to issue COMMIT, otherwise during the “minute” from the next step you will have time not only for coffee but for lunch as well.

Comment on step 5

No comment. Or rather, one. If for some reason the replica refresh process does not work (for example, the link is down), then after a certain number of failed attempts the refresh job is marked as broken and is no longer triggered once a minute. However, this already goes beyond the simplest example, in which everything should work normally, and this is where the “details”, as they say, begin.

What I have not covered

A mass of those very “details”, in keeping with the purpose of this note. For example:

But if the example, worked through with your own hands, has encouraged you, then a closer study of one-way replication will be, purely psychologically, easier for you.

Whether or not this text helped you with this task, or if you simply want to send me your remarks, recommendations, or comments, I would be grateful to receive a message from you.

Additional information is available from Interface Ltd.


Snapshots Are NOT Backups

Comparing Storage-based Snapshot Technologies with Recovery Manager (RMAN) and Fast Recovery Area for Oracle Databases

by Tim Chien, Oracle

Introduction

While storage snapshots are widely used to quickly create point-in-time virtual copies of data, they are also often marketed as valid “backup solutions”. This is an incorrect and dangerous assumption because snapshots, unless copied to secondary media (e.g. another storage array or tape), do not protect against media failures. While there are benefits of using snapshots for development or testing purposes on non-production systems, they should not be considered as valid data protection or backups of Oracle databases. Instead, customers should look to Recovery Manager (RMAN) and Fast Recovery Area (FRA) as the Oracle-supported solution to create and manage Oracle database backups. Note that since RMAN and Fast Recovery Area are built-in features of the Oracle database, this solution also applies to Oracle Exadata Database Machine, with the additional benefit of extremely high performance.

This article provides a comparison of storage-based snapshot technologies with RMAN and Fast Recovery Area backups.

Overview – Recovery Manager (RMAN) and Fast Recovery Area (FRA)


Figure 1 – Oracle Suggested Backup Strategy

Overview – Storage Snapshots

Storage snapshots have offered development and QA capabilities for database and non-database environments for many years, providing the ability to quickly create point-in-time storage-efficient virtual copies of the data. Snapshots do not require an initial copy, as they are not stored as physical copies of blocks, but rather as pointers to the blocks that existed when the snapshot was created. Because of this tight physical relationship, the snapshot is maintained on the same storage array as the original data. Snapshots are generally implemented either as copy-on-write or redirect-on-write-based methods.

In the copy-on-write case, after a snapshot is taken, and upon the first change to a storage block, the array copies the before-change block to a new location on disk, thus maintaining the before-change block for the snapshot and the new block for the active version of the database. In the diagram below, block C is updated, so the old block is copied to a new location, then the new block (C’) is written to the original location.


Figure 2 – Copy-on-Write Storage Snapshot
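The copy-on-write behaviour described above can be sketched as a toy model. This is an illustrative Python sketch, not the API of any storage array: `CowVolume` and its methods are hypothetical, blocks are plain strings, and only a single snapshot is modelled.

```python
# Toy copy-on-write volume; CowVolume is a hypothetical name, not a real array API.

class CowVolume:
    def __init__(self, blocks):
        self.active = list(blocks)   # blocks at their original locations
        self.snapshot = None         # preserved before-change blocks, by block index
        self.writes = 0              # physical block writes performed

    def take_snapshot(self):
        # No data is copied when the snapshot is taken.
        self.snapshot = {}

    def write(self, index, data):
        if self.snapshot is not None and index not in self.snapshot:
            # First change since the snapshot: copy the old block aside.
            self.snapshot[index] = self.active[index]
            self.writes += 1
        self.active[index] = data    # the new block overwrites the original location
        self.writes += 1

    def read_snapshot(self):
        # Unchanged blocks are still read from the active volume.
        return [self.snapshot.get(i, blk) for i, blk in enumerate(self.active)]


vol = CowVolume(["A", "B", "C", "D"])
vol.take_snapshot()
vol.write(2, "C'")     # first change to block C: two physical writes
vol.write(2, "C''")    # later change to the same block: one physical write
```

After these two updates the model has performed three physical writes, and the snapshot still reads back as A, B, C, D even though only the old copy of block C is stored separately.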

In the redirect-on-write case, the new block (C’) is directly written to the snapshot storage, as shown in the diagram below. Thus, there are no double writes when a block changes, as in the copy-on-write case, but the active version of the blocks becomes fragmented over time.


Figure 3 – Redirect-on-Write Storage Snapshot
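A matching toy model for redirect-on-write shows both properties mentioned above: a changed block costs a single write, but the active volume's blocks stop being laid out contiguously. As before, this is a Python sketch under stated assumptions; `RedirectOnWriteVolume` and `is_contiguous` are hypothetical names, and new blocks are simply appended to the end of a list that stands in for physical storage.

```python
# Toy redirect-on-write volume; all names are hypothetical.

class RedirectOnWriteVolume:
    def __init__(self, blocks):
        self.store = list(blocks)                        # physical block store
        self.active_map = list(range(len(self.store)))   # logical -> physical location
        self.snapshot_map = None
        self.writes = 0                                  # physical block writes performed

    def take_snapshot(self):
        # Only the pointers are copied, never the blocks themselves.
        self.snapshot_map = list(self.active_map)

    def write(self, index, data):
        shared = (self.snapshot_map is not None
                  and self.active_map[index] == self.snapshot_map[index])
        if shared:
            # Redirect: the new block goes to a new location, the old one stays put.
            self.store.append(data)
            self.active_map[index] = len(self.store) - 1
        else:
            self.store[self.active_map[index]] = data
        self.writes += 1

    def read_active(self):
        return [self.store[p] for p in self.active_map]

    def read_snapshot(self):
        if self.snapshot_map is None:
            raise ValueError("no snapshot has been taken")
        return [self.store[p] for p in self.snapshot_map]


def is_contiguous(mapping):
    """True if consecutive logical blocks sit in consecutive physical locations."""
    return all(b == a + 1 for a, b in zip(mapping, mapping[1:]))


vol = RedirectOnWriteVolume(["A", "B", "C", "D"])
vol.take_snapshot()
vol.write(2, "C'")     # a single physical write, redirected to a new location
```

In this run the active mapping becomes 0, 1, 4, 3: one write was enough, but a sequential read of the active volume now has to jump to location 4 and back.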

Snapshots have no awareness of the Oracle block structure (as they operate at a storage block-level) and more importantly, they are inherently physically different than backups (consisting of pointers instead of blocks). As a result, there are significant trade-offs that should be considered before using snapshots to provide data protection for the Oracle database.

The following sections provide details on the advantages/disadvantages of the RMAN incrementally updated backups and FRA solution versus storage snapshots.

Oracle Database Protection

As previously mentioned, RMAN in combination with the FRA forms the foundation of Oracle’s recommended backup strategy, involving a one-time image copy backup to the FRA, daily fast incremental backups using RMAN’s block change tracking capability, and regular update of the image copy by applying the incremental backup. When using RMAN to back up the FRA files or the database itself to tape, Oracle Secure Backup provides an Oracle-optimized, RMAN-integrated backup approach, leveraging unused block compression, undo elimination, and shared memory buffers to offer the highest performing database backups to tape. Many leading third party backup vendors have also offered RMAN-integrated tape backup methods over the last several years.

Snapshots, on the other hand, are not designed for Oracle data protection. They have no knowledge of an Oracle block structure, and hence do not and cannot validate Oracle data when they are created. They cannot be used for any data loss or physical corruption scenarios. A block corruption that goes undetected can potentially affect a series of snapshots, if the block does not change over time. Since snapshots reside on the same array as the source database, they are vulnerable to failures that affect the storage array. That is why a snapshot, even though it is created very quickly, does not constitute a backup of the original data. For a snapshot to be used as a valid backup, it must be re-constituted as a full set of blocks to another storage array or to tape, which involves the same performance issues that are characteristic of a full copy. Finally, restoring a snapshot has the side effect of nullifying all snapshots that were taken after it, unless the snapshot is fully restored as a copy of the production data to an alternate location. Given these inherent deficiencies with snapshots, it is evident that only Oracle-aware RMAN backups can offer true data protection “peace of mind”.

Database Performance

The RMAN incrementally updated backup method requires an initial image copy backup of the database, i.e. 1X copy of the database minus temp data files. After the full backup is taken, fast incremental backups and the incremental update of the copy are the only required backup operations thereafter. RMAN performs sequential Oracle block I/O reads on the database storage during the backup. Consequently, database performance can potentially be impacted during backups due to the additional I/O consumption. Note that fast incremental backups reduce I/O consumption by reading only the blocks changed since the last full or incremental backup, and they do so in an Oracle-optimized manner using RMAN’s block change tracking capability. In addition, the incremental update operation utilizes I/O only on the FRA storage and not on the production database storage.

With respect to copy-on-write-based snapshots, the database performance impact manifests in two ways. First, after a snapshot is created, the first write to a database block translates to two storage I/O writes – one for the copy of the original block to a new snapshot storage location and one for the write of the new block over the original block. The increased I/O usage can have a severe impact on production database performance. Secondly, after reverting the production database volume to a previous snapshot, the now-active version of the storage blocks includes references to the snapshot blocks, which are likely to be fragmented across the disk instead of being sequentially laid out (which the database still expects when I/O is issued). For example, in the previous diagram of the copy-on-write snapshot, an I/O request for block C is redirected to the snapshot version of block C, while I/O requests for block B are not redirected, since it did not change relative to the time the snapshot was taken. When the database issues a 1 MB I/O, instead of reading the data sequentially in a single large read, it will issue 128 random I/Os (assuming 8K block size). As multiple snapshots are created and restored over time, the resulting fragmented block layout can result in a potentially 10-100X slowdown in database performance.
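The figure of 128 random I/Os is simply the request size divided by the block size. A small Python check of that arithmetic follows; the function name is hypothetical, and the worst case is assumed, in which every block of the request has to be fetched separately.

```python
KB = 1024
MB = 1024 * KB


def random_ios(request_bytes, block_bytes):
    """Worst case: every block covered by the request is read with its own I/O."""
    return -(-request_bytes // block_bytes)   # ceiling division


ios_8k = random_ios(1 * MB, 8 * KB)    # the case quoted in the text
ios_16k = random_ios(1 * MB, 16 * KB)  # a larger block size halves the count
```

With an 8K block, 1 MB / 8 KB = 128, which matches the number quoted in the text.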

Because of these reasons, it is never a good idea to create and use snapshots on production database storage. Snapshots, if used for development and QA purposes, should be created on secondary copies of data which do not support production workload. Oracle’s High Availability (HA) Development group has published a highly efficient way to achieve this, using Oracle Data Guard and the ZFS Storage Appliance – see this white paper for more details.

Database Backup & Restore Performance

As previously discussed, the RMAN incrementally updated backup method requires an initial image copy backup, then incremental backups and incremental updates to the copy thereafter. Thus, the initial backup time is proportional to the size of the database and backup times thereafter are proportional to the volume of changed blocks between incrementals. If a copy needs to be preserved to satisfy the retention policy before being incrementally updated, RMAN can back up the copy to tape. Backing up the copy and other FRA files to tape also allows disk space to be automatically reclaimed by the FRA when additional space is needed for new files. When a recovery is needed, the full copy can either be restored to the production database storage, or used directly as the production data files via the RMAN SWITCH command (i.e. restore-free recovery). The restored data files are then recovered to a consistent point-in-time via the redo apply process.

For example, if datafile 4 is accidentally deleted or severely corrupted, the DBA can use these simple RMAN commands to quickly switch to the copy of the datafile maintained in the FRA and make it consistent with the rest of the database, without impacting the rest of the database and without needing to do any time-consuming restore operation:

Database Cloning

Snapshot-based clones, on the other hand, can be created near-instantly and occupy a fraction of the production database storage, depending on the storage block change pattern. Just as copy-on-write methods are used to create snapshots, the same methods are used to create snapshot-based clones. The snapshot clone physically occupies space equivalent to the volume of unique blocks that have changed since the clone was created, not space proportional to the size of the database itself. However, just as in the case of snapshots, there is additional database I/O impact due to copy-on-write – this impact is exacerbated for writable snapshot clones, where the clone database block changes are also tracked via copy-on-write. Because of the severe I/O performance impact, snapshot clones should not be used on the production database, but on a secondary copy of the database.

Summary

Storage-based snapshot technologies serve a different purpose compared to backup and data protection solutions. Since snapshots reside on the same array as the production database, they are vulnerable to array failures and thus should not be considered valid “backups” of the data. Snapshots can be effectively utilized for development/QA/test activities on a secondary copy of the production database, but should not be utilized on the production database itself due to the severe I/O impact of copy-on-write. For Oracle database backups, customers should leverage RMAN and Fast Recovery Area, along with Oracle Secure Backup for integrated tape backups, to provide complete data loss and corruption protection.


Tim Chien is a Principal Product Manager with the Oracle Database High Availability Development team, focusing on backup and recovery.


3 Snapshot Concepts & Architecture

This chapter explains the concepts and architecture of Oracle Snapshots. This chapter covers the following topics:

Snapshot Concepts

To learn more about materialized views for data warehousing, see the Oracle8i Tuning book.

What is a Snapshot?


Figure 3-1 Snapshot Connected to a Single Master Site in a Replicated Environment

Snapshots also have the option of containing a WHERE clause so that snapshot sites can contain custom data sets, which can be very helpful for regional offices or sales forces that don’t require the complete corporate data set.

Why use Snapshots?

Oracle offers a variety of snapshots to meet the needs of many different replication (and non-replication) situations; each of these snapshots will be discussed in detail in following sections. You might use a snapshot to achieve one or more of the following:

Ease Network Loads

If one of your goals is to reduce network loads, you can use snapshots to distribute your corporate database to regional sites; instead of the entire company accessing a single database server, user load is distributed across multiple database servers.

While multimaster replication also distributes a corporate database to multiple sites, the networking requirements are greater than replicating with snapshots because of the transaction by transaction nature of multimaster replication. Since multimaster replication can provide real-time or near real-time results, network traffic is much greater, resulting in the need for a dedicated network link.

Snapshots are updated via an efficient batch process from a single master site. Because of the point-in-time nature of snapshot replication, they have fewer network requirements and depend less on the network than multimaster replication. In addition to not requiring a dedicated network connection, replicating data with snapshots increases data availability by providing local access to the target data. These benefits, combined with mass deployment and data subsetting (both of which also reduce network loads), will greatly enhance the performance and reliability of your replicated database.

Mass Deployment

Deployment templates allow you to precreate a snapshot environment locally and then quickly and easily deploy it to support sales force automation and other mass-deployed environments. Parameters allow you to create custom data sets for individual users without changing the deployment template. This technology allows you to roll out a database infrastructure to hundreds or thousands of users.

Data Subsetting

Snapshots allow you to replicate data based on column and/or row-level subsetting (remember that multimaster replication requires replication of the entire table). Data subsetting allows you to replicate information that only pertains to a particular site. For example, if you have a regional sales office, you might replicate only the data that is needed in that region, thereby cutting down on unnecessary network traffic.

Disconnected Computing

Unlike multimaster replication, snapshots do not require a dedicated network link. Though you have the option of automating the refresh process by scheduling a job, you can also refresh your snapshot manually, on demand. This is an ideal solution for sales applications running on a laptop. For example, a developer can use the Oracle Replication Management API to build on-demand refresh into the sales application. When the salesperson has completed the day’s orders, they simply dial up the network and use the integrated mechanism to refresh the database, thus transferring the orders to the main office.

Available Snapshots

As previously mentioned, there are several types of snapshots available to meet a variety of distributed database needs. The following sections describe each snapshot and also describe some environments for which they are best suited.

Primary Key

Primary key snapshots are considered the normal (default) type of snapshot. Primary key snapshots are updateable if the snapshot has been created as part of a snapshot group (see “Snapshot Groups”) and “FOR UPDATE” was specified when defining the snapshot. Changes are propagated according to the row changes as identified by the primary key value of the row (vs. the ROWID). The SQL command for creating an updateable, primary key snapshot might look like:

Primary key snapshots may contain a subquery so that you can create a horizontally partitioned subset of data at the remote snapshot site. This subquery may be as simple as a basic WHERE clause or as complex as a multilevel WHERE EXISTS clause. Primary key snapshots that contain a selected class of subqueries can still be incrementally or fast refreshed (see “Snapshots with Subqueries” for more information). The following is a subquery snapshot with a WHERE clause containing a subquery:

ROWID

For backwards compatibility, Oracle supports ROWID snapshots in addition to the default, primary key snapshots. A ROWID snapshot is based on the physical row identifiers (ROWIDs) of the rows in a master table. ROWID snapshots should be used only for snapshots based on master tables from an Oracle7 database, and should not be used when creating new snapshots based on master tables from Oracle8 or greater databases (see “Snapshot Log” for more information on the differences between a ROWID and Primary Key snapshot).

Complex

If your snapshot does not need to be fast refreshable, then you can create a complex snapshot that allows for a defining SELECT statement that might contain an aggregate or a set operation. Specifically, a snapshot is considered complex when the defining query of a snapshot contains:

In most cases, you should avoid using complex snapshots since they cannot be fast refreshed, which may degrade network performance (see “Refresh Process” for information).

A sample complex snapshot CREATE statement might look like the following:

A Comparison of Simple and Complex Snapshots

For certain applications, you might want to consider the use of a complex snapshot. Figure 3-2 and the following text discuss some issues that you should consider.

Figure 3-2 Comparison of Simple and Complex Snapshots

In summary, to decide which method to use:

Read-only Snapshots

Any of the previously described types of snapshots can be made read-only by omitting the FOR UPDATE clause (or disabling the equivalent checkbox in the Replication Manager GUI). Read-only snapshots use many of the same mechanisms as updateable snapshots, except that they do not need to belong to a snapshot group (see “Snapshot Groups” for more information).

In addition to not needing to belong to a snapshot group, using read-only snapshots eliminates introducing data conflicts originating from a remote snapshot site, though this convenience means that updates cannot be made at the remote snapshot site. You might define a read-only snapshot as:

Data Subsetting with Snapshots

In certain situations, you will want your snapshot to reflect a horizontally or vertically partitioned segment of the master table’s data. If you use deployment templates to build your snapshots, you can define vertical data subsets to replicate data along column boundaries (for additional information on vertical partitioning, see “Vertical Partitioning”). Some reasons to consider partitioning data are:

In many instances, the above objectives can be met by using a simple WHERE clause. For example, the following DDL creates a snapshot that contains information about customers who are in the 19555 zip code:

Snapshots with Subqueries

The above example works very well for individual snapshots that don’t have any referential constraints to other snapshots. But, if you want more than just the customer information, maintaining and defining these snapshots could be difficult.

The Customer snapshot has a very simple defining query since the Customer master table is at the top of the hierarchy:

When you create the Orders snapshot, you want to retrieve all of the orders for the customers located in the 19555 zip code. If you look at the relationships in Figure 3-3, you will notice that the Customer and Orders tables are related via the C_ID column. The following DDL will create the Orders snapshot with the appropriate data set:

Figure 3-3 Advanced subsetting with a subquery.

Creating the Order_line snapshot uses the same approach as the Orders snapshot, except that you have one additional subquery. Notice in Figure 3-3 that the Order_line and the Orders tables are related via the O_ID column. The following DDL will create the Order_line snapshot with the appropriate data set:

The snapshots created by these three DDL statements are each fast refreshable. If new customers are identified in the target zip code, the new data will be propagated to the snapshot site during the subsequent refresh process. Likewise, if a customer is removed from the target zip code, the appropriate data will also be removed from the snapshot during the subsequent refresh process.

The subqueries in these snapshot examples walk up the many-to-one references from the child to the parent tables. The snapshots will be populated with data that satisfies the defining query for each of these snapshots and will be refreshed only with data that satisfies these defining queries.

Using Assignment Tables

While the previous examples greatly enhance the flexibility of snapshots, they have certain limitations. If the salesperson changed territories, or the existing territory was assigned an additional zip code, the snapshot definitions above would need to be altered or re-created, because the zip code 19555 is "hard coded" in them.

With this in mind, if "assignment" tables are used in conjunction with subquery subsetting, changes to a snapshot environment can easily be controlled by the DBA. For example, consider the customer/salesperson relationship in Figure 3-4.

In this example, a salesperson is assigned their customers based on the Assignment table. If new salespersons are hired or other salespersons leave, the existing customers can be assigned to their new salesperson by simply modifying the contents of the assignment table. Besides creating a single point of administration, assignment tables used in conjunction with subquery subsetting enable this easy administration to remain secure. For example, salesperson #1001 will not be able to view the customer information of other salespersons (very important if the customer information contains sensitive data).

Figure 3-4 Customer/Salesperson Relationship

Considering the relationships pictured in Figure 3-4, if the Orders snapshot’s defining query was specified as (pay special attention to the 'gsmith' value in the last line of the CREATE SNAPSHOT statement):

then the Orders snapshot will be populated with order data for the customers that are assigned to salesperson 'gsmith'.
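A defining query of this kind could be sketched as shown here. Figure 3-4 is not reproduced, so the table names sales.assignments and sales.salespersons, the columns s_id and s_name, and the database link orc1.world are all assumptions chosen for illustration.

```sql
-- Orders are kept only for customers assigned, through the
-- assignment table, to the salesperson named in the last line.
-- All object names except c_id are hypothetical.
CREATE SNAPSHOT sales.orders REFRESH FAST AS
  SELECT * FROM sales.orders@orc1.world o
  WHERE EXISTS
    (SELECT c_id FROM sales.customers@orc1.world c
     WHERE o.c_id = c.c_id
       AND EXISTS
         (SELECT * FROM sales.assignments@orc1.world a
          WHERE a.c_id = c.c_id
            AND EXISTS
              (SELECT * FROM sales.salespersons@orc1.world s
               WHERE a.s_id = s.s_id
                 AND s.s_name = 'gsmith')));
```

Because the salesperson is identified by a single literal, reassigning customers only requires changing rows in the assignment table at the master site; the snapshot definition itself stays the same.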

With this flexibility, managers can easily control snapshot data sets by making simple changes to the assignment table (without requiring a DBA to modify any SQL). For example, if the specified salesperson was assigned two new customers, the manager would simply assign these two new customers to the salesperson in the assignment table. After the next fast snapshot refresh, the data for these two customers will be propagated to the target snapshot site, such as the salesperson’s laptop (see "Refresh Types" for more information). Conversely, if a customer was taken away from the specified salesperson, all data pertaining to the specified customer would be removed from the snapshot site after the next refresh and the salesperson would no longer be able to access that information.

Restrictions for Snapshots with Subqueries

Snapshots with a subquery must be of the primary key type (see "Primary Key" for more information about primary key snapshots). Additionally, the defining query of a snapshot with a subquery is subject to several other restrictions to preserve the snapshot’s fast refresh capability.

Note: To determine whether a snapshot’s subquery satisfies the many restrictions detailed in Table 3-1, create the snapshot with "fast refresh". Oracle will return errors if the snapshot violates any restrictions for simple subquery snapshots. If you specify "force refresh", you may not receive any errors because Oracle will automatically send data for a complete refresh.

Table 3-1 Restrictions for Snapshots with Subqueries

Snapshot Architecture

The mechanisms used in snapshot replication are depicted in Figure 3-5. Some of these mechanisms are optional and are used only as needed to support the created snapshot environment. For example, if you have a read-only snapshot, then you will not have an updatable snapshot log or internal trigger at the remote site. Also, if you have a complex snapshot that cannot be fast refreshed, then you may not have a snapshot log at the master site.

Figure 3-5 Snapshot Replication Mechanisms


Master Site Mechanisms

The three mechanisms displayed in Figure 3-6 are required at the master site to support fast refreshing of snapshots.

Figure 3-6 Master Site Mechanisms


Master Table

The master table is the basis for the snapshot and is located at the target master site. This table may be involved in both snapshot replication and multimaster replication (remember that a snapshot points to only one master site).

Changes made to the master table as recorded by the snapshot log will be propagated to the snapshot during the refresh process.

Internal Trigger

When changes are made to the master table using DML, an internal trigger records the primary key and/or ROWID of the rows affected, along with filter column information (see footnote 1), in the snapshot log. This is an internal trigger that is automatically activated when you create a snapshot log for the target master table.

Snapshot Log

As described in the previous section, the internal trigger adds the information to the snapshot log whenever a DML transaction has taken place at the target master table.

There are three types of snapshot logs: primary key, ROWID, and combination.

A combination snapshot log works in the same manner as the primary key and ROWID snapshot log, except both the primary key and the ROWID of the affected row are recorded.

Though the difference between snapshot logs based on primary keys and ROWIDs is very small (one records rows affected using the primary key, while the other records affected rows using the physical ROWID), the practical impact is quite large. Using ROWID snapshots and snapshot logs will make reorganizing and/or truncating your master tables very difficult as it will prevent your ROWID snapshots from being FAST refreshed. If you reorganize or truncate your master table, your ROWID snapshot will need to be COMPLETE refreshed since the ROWIDs of the master table will have changed.
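The three kinds of snapshot log could be created as sketched here. A master table has a single snapshot log, so each statement targets a different table; the table names are illustrative assumptions.

```sql
-- Primary key snapshot log: records the primary key of changed rows.
CREATE SNAPSHOT LOG ON sales.customers WITH PRIMARY KEY;

-- ROWID snapshot log: records the physical ROWID of changed rows.
CREATE SNAPSHOT LOG ON sales.orders WITH ROWID;

-- Combination snapshot log: records both values, so it can support
-- primary key snapshots and ROWID snapshots of the same master table.
CREATE SNAPSHOT LOG ON sales.order_lines WITH PRIMARY KEY, ROWID;
```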

Snapshot Site Mechanisms

When a snapshot is created, several additional mechanisms are created at the snapshot site to support the snapshot. Specifically, a base table, at least one index, and optionally a view are created. If you create an updateable snapshot, an internal trigger and a local log (updateable snapshot log) are also created at the snapshot site.

Base Table

Beginning with Oracle8i release 8.1.5, the base table is the actual snapshot (no view is required). The base table will have the name that you have specified during the creation process.

Any indexes generated when you create the snapshot are created on the base table.

Oracle supports snapshots of master table columns that use the following datatypes: NUMBER, DATE, VARCHAR2, CHAR, NVARCHAR2, NCHAR, RAW, ROWID.

Oracle also supports snapshots of master table columns that use the following large object types: binary LOBs (BLOBs), character LOBs (CLOBs), and national character LOBs (NCLOBs). However, you cannot reference LOB columns in a WHERE clause of a snapshot’s defining query. The deferred and synchronous remote procedure call mechanism used for replication propagates only the piece-wise changes to the supported LOB datatypes when piece-wise updates and appends are applied to these LOB columns.

Note: Oracle8i does not support replication of LOB datatypes in replication environments where some sites are running Oracle7 release 7.3.

Oracle does not support the replication of columns that use the LONG datatype. Oracle simply omits the data in LONG columns from snapshots.

Index


Local Log

A local update log (USLOG$_Materialized_View_Name) is used to determine what data needs to be pulled from the target master table. A read-only snapshot does not require this local log.

Internal Trigger

Just like the internal trigger at the master site, the internal trigger at the snapshot site records DML changes applied to an updateable snapshot in the USLOG$_Snapshot_Name log.

Organizational Mechanisms

In addition to the snapshot mechanisms described in the previous section, there are several additional mechanisms that organize the snapshots at the snapshot site. These mechanisms maintain organizational consistency between the snapshot site and the master site as well as transactional (read) consistency with the target master group.

Snapshot Groups

A snapshot group in an advanced replication system maintains a partial or complete copy of the objects at the target master group (snapshot groups cannot span across master group boundaries). Figure 3-7 displays the correlation between Groups A and B at the master site and Groups A and B at the snapshot site.

Figure 3-7 Snapshot Groups Correspond with Master Groups

Group A at the snapshot site (Figure 3-7) contains only some of the objects in the corresponding Group A at the master site. Group B at the snapshot site contains all objects in Group B at the master site. Under no circumstances, however, could Group B at the snapshot site contain objects from Group A at the master site. As illustrated in Figure 3-7, snapshot groups are named the same as the master groups that the snapshot group is based on. For example, a snapshot group based on a "PERSONNEL" master group will also be named "PERSONNEL".

In addition to maintaining organizational consistency between snapshot sites and master sites, snapshot groups are required for supporting updateable snapshots. If a snapshot does not belong to a snapshot group, then it can only be a read-only snapshot.

Using a Group Owner

If you need to support multiple users within the same database at a snapshot site, you may want to create multiple snapshot groups for the target master group. This enables you to define different subqueries for your snapshot definitions in each snapshot group, allowing each user to access only his or her subset of data.

Defining multiple data sets with different snapshot groups is more secure than defining different WHERE clauses for multiple views supporting different users. Since you can grant users access to individual snapshot objects, you can control what the user views, deletes, and inserts; with a WHERE clause in a view, you can only control what a user views, but not the deleting or inserting of data.

Defining multiple snapshot groups gives you the ability to control data sets at a group level. For example, if you create different snapshot groups for the HR, PERSONNEL, and MANUFACTURING departments, you can administer each department as a group (versus individual objects). For example, you can refresh the snapshots as a departmental group or you can drop the objects as a group.

With respect to dropping a department, if you group all data sets into a single snapshot group and the MANUFACTURING department needs to be removed from the data set, you will need to drop and re-create the snapshot with a WHERE clause that does not contain the MANUFACTURING department. In addition to causing you additional work, it could disrupt other departments from accessing their data. Compartmentalizing your data into separate groups allows you to efficiently manage the data since changes to one group will not affect another group.

To accommodate multiple snapshot groups at the same snapshot site that are based on a single master group, you can specify a group owner as an additional identifier when defining your snapshot group.

After you have defined your snapshot group with the addition of a group owner, you add your snapshot objects to the target snapshot group by defining the same group owner. When using a group owner, remember that each snapshot object must have a unique name. If a single snapshot site will have multiple snapshot groups based on the same master group, a snapshot group’s object names cannot have the same name as snapshot objects in another snapshot group. To avoid conflicting names, you might want to append the group owner name to the end of your object name. For example, if you have group owners "HR" and "PERSONNEL", you might name the "EMP" snapshot object as "EMP_HR" and "EMP_PERSONNEL", respectively.

Additionally, all snapshot groups that are based on the same master group at a single snapshot site must "point" to the same master site. For example, if the SCOTT_MG snapshot group owned by HR is based on the associated master group at the ORC1.WORLD master site, then the SCOTT_MG snapshot group owned by PERSONNEL must also be based on the associated master group at ORC1.WORLD, assuming that the HR and PERSONNEL owned groups are at the same snapshot site.

See the "Using a Group Owner" section in Chapter 7 of the Oracle8i Replication API Reference manual for more information on defining a group owner using the replication management API.
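As a rough sketch of what this looks like with the replication management API, the following block creates the SCOTT_MG snapshot group for the HR group owner and adds an EMP_HR snapshot to it. It assumes that the EMP_HR snapshot already exists in the SCOTT schema, and the parameter names are given as I understand them; check them against the API reference before use.

```sql
BEGIN
  -- Snapshot group based on the SCOTT_MG master group at ORC1.WORLD,
  -- distinguished from other SCOTT_MG groups by its group owner.
  DBMS_REPCAT.CREATE_SNAPSHOT_REPGROUP(
    gname            => 'SCOTT_MG',
    master           => 'ORC1.WORLD',
    propagation_mode => 'ASYNCHRONOUS',
    gowner           => 'HR');

  -- Add an existing snapshot to that group, naming the same owner.
  -- SCOTT.EMP_HR is a hypothetical, previously created snapshot.
  DBMS_REPCAT.CREATE_SNAPSHOT_REPOBJECT(
    sname  => 'SCOTT',
    oname  => 'EMP_HR',
    type   => 'SNAPSHOT',
    gname  => 'SCOTT_MG',
    gowner => 'HR');
END;
/
```

A second group for PERSONNEL would repeat the same calls with gowner set to 'PERSONNEL' and a distinct object name such as EMP_PERSONNEL.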

Refresh Groups

To preserve referential integrity and transactional (read) consistency among multiple snapshots, Oracle has the ability to refresh individual snapshots as part of a refresh group. After refreshing all of the snapshots in a refresh group, the data of all snapshots in the group will correspond to the same transactionally consistent point-in-time.

As illustrated in Figure 3-8, a refresh group can contain snapshots from more than one snapshot group to maintain transactional (read) consistency across master group boundaries.

Figure 3-8 Refresh Groups May Contain Objects from Multiple Snapshot Groups

While you may want to define a single refresh group per snapshot group, it may be more efficient to use one large refresh group that contains objects from multiple snapshot groups (such a configuration reduces the amount of "overhead" needed to refresh your snapshots). A refresh group can contain up to 400 snapshots (the number of snapshots that a refresh group can contain has increased from earlier versions of Oracle Server).

One configuration that you want to avoid is using multiple refresh groups to refresh the contents of a single snapshot group. Using multiple refresh groups to refresh the contents of a single snapshot group may introduce inconsistencies in the snapshot data, which may cause referential integrity problems at the snapshot site. This type of configuration should only be used when you have in-depth knowledge of the database environment and can prevent any referential integrity problems.
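A refresh group that keeps related snapshots transactionally consistent could be created along these lines. The group name sales.sales_refg, the snapshot names, and the hourly interval are assumptions chosen for illustration.

```sql
BEGIN
  -- All three snapshots are refreshed together to the same
  -- point in time, preserving the parent/child relationships.
  DBMS_REFRESH.MAKE(
    name      => 'sales.sales_refg',   -- hypothetical group name
    list      => 'sales.customers, sales.orders, sales.order_lines',
    next_date => SYSDATE,              -- first refresh: now
    interval  => 'SYSDATE + 1/24');    -- then once an hour
END;
/
```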

Refresh Process

A snapshot’s data does not necessarily match the current data of its master table. A snapshot is a transactionally (read) consistent reflection of its master table as the data existed at a specific point-in-time (i.e. at creation or at a refresh interval). To keep a snapshot’s data relatively current with the data of its master table, the snapshot needs to be periodically refreshed. A snapshot refresh is an efficient batch operation that makes that snapshot reflect a more current state of its master table.

You must decide how and when to refresh each snapshot to make it more current. For example, snapshots based on master tables that applications update often require frequent refreshes. In contrast, snapshots based on relatively static master tables usually require infrequent refreshes. In summary, you must analyze application characteristics and requirements to determine appropriate snapshot refresh intervals.

To refresh snapshots, Oracle supports several refresh types and methods of initiating a refresh.

Refresh Types

Oracle can refresh a snapshot using a FAST, COMPLETE, or FORCE refresh.

Complete Refreshes

To perform a complete refresh of a snapshot, the server that manages the snapshot executes the snapshot’s defining query. The result set of the query replaces the existing snapshot data to refresh the snapshot. Oracle can perform a complete refresh for any snapshot. Depending on the amount of data that satisfies the defining query, a complete refresh can take a substantially longer amount of time to perform than a fast refresh.

If a snapshot is completely refreshed, set its PCTFREE to 0 and PCTUSED to 100 for maximum efficiency.

Datatype Considerations for Snapshots

Fast Refreshes

If a fast refresh is not possible, an error is raised and the snapshot(s) will not be refreshed.

Figure 3-9 Fast Refresh of a Snapshot


Force Refreshes

To perform a force refresh of a snapshot, the server that manages the snapshot first tries to perform a fast refresh. If a fast refresh is not possible, then Oracle performs a complete refresh. Use the Force setting when you want the snapshot to refresh if the fast refresh fails.
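The three refresh types can be requested for an individual snapshot through the DBMS_SNAPSHOT package, as in this sketch. The snapshot name sales.orders is illustrative.

```sql
BEGIN
  -- 'F' requests a fast refresh; an error is raised if it is not possible.
  DBMS_SNAPSHOT.REFRESH('sales.orders', 'F');

  -- 'C' requests a complete refresh by re-running the defining query.
  DBMS_SNAPSHOT.REFRESH('sales.orders', 'C');

  -- '?' requests a force refresh: fast if possible, otherwise complete.
  DBMS_SNAPSHOT.REFRESH('sales.orders', '?');
END;
/
```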

Initiating a Refresh

When creating a refresh group, administrators may configure the group so that Oracle can automatically refresh its snapshots at scheduled intervals. Conversely, administrators may omit scheduling information so that the refresh group needs to be refreshed manually or "on-demand" (manual refreshing is an ideal solution when refreshing is performed with a dial-up network connection).

Scheduled Refresh

When you create a refresh group for scheduled refreshing, you must specify a scheduled refresh interval for the group during the creation process. When setting a group’s refresh interval, consider the following characteristics:

On-demand Refresh

Scheduled snapshot refreshes may not always be the appropriate solution for your environment/situation. For example, immediately following a bulk data load into a master table, dependent snapshots will no longer represent the master table’s data. Rather than wait for the next scheduled automatic group refreshes, you might want to manually refresh dependent snapshot groups to immediately propagate the new rows of the master table to associated snapshots.

You may also want to refresh your snapshots on-demand when your snapshots are integrated with a sales force automation system located on a disconnected laptop. Developers designing the sales force automation software can create an application control (for example, a button) that a salesperson can use to refresh the snapshots when they are ready to transfer the day’s orders to the server after establishing a dial-up network connection.
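An on-demand refresh of a whole refresh group is a single procedure call, which is the kind of statement such an application control could issue. The group name sales.sales_refg is a hypothetical example and must refer to an existing refresh group.

```sql
BEGIN
  -- Refresh every snapshot in the group to one consistent point in time.
  DBMS_REFRESH.REFRESH('sales.sales_refg');
END;
/
```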

Prepare for Snapshots

Most problems encountered with snapshot replication come from not preparing the environment properly. There are four essential tasks that you must perform before you begin creating your snapshot environment: create the necessary schema, create the necessary database links, assign the appropriate privileges, and allocate sufficient job processes.

Oracle’s Replication Manager setup wizard automatically performs the tasks that are described below. The following is provided for your understanding of the replication environment and especially for those who use the Replication Management API. After the setup wizard is executed, make sure to create the necessary snapshot logs (see "Create a Snapshot Log"). See "Setting Up Snapshot Site" for instructions on using Replication Manager to set up your snapshot site. You are encouraged to use Replication Manager whenever possible.

If you are not able to use Replication Manager, review the "Setup Snapshot Site" section in chapter 2 of the Oracle8i Replication API Reference for detailed instructions on setting up your snapshot site using the Replication Management API.

The following sections describe what the Replication Manager setup wizard or the script in the Oracle8i Replication API Reference will do to set up your snapshot site.

Create Snapshot Site Users

Each snapshot site needs several users to perform the administrative and refreshing activities at the snapshot site. You will need to create and grant the necessary privileges to the snapshot administrator and to the refresher.

Create Master Site Users

You will need equivalent proxy users at the target master site to perform tasks on behalf of the snapshot site users. Usually, a proxy snapshot administrator and a proxy refresher will be created.

Schema

A schema containing a snapshot in a remote database should correspond to the schema that contains the master table in the master database. Therefore, identify the schemas that contain the master tables that you want to replicate with snapshots. Once you have identified the target schemas at the master database, create the corresponding accounts with the same names at the remote database. For example, if all master tables are in the SALES schema of the DB1 database, create a corresponding SALES schema in the snapshot database DB2. (If you are reviewing the steps in Oracle8i Replication API Reference manual, the necessary schema(s) are created as part of the script described in chapter 5.)

Database Link

The defining query of a snapshot may use one or more database links to reference remote table data. Before creating snapshots, the database links you plan to use must be available. Furthermore, the account that a database link uses to access a remote database defines the security context under which Oracle creates and subsequently refreshes a snapshot.

To ensure proper behavior, a snapshot’s defining query must use a database link that includes an embedded user name and password in its definition; you cannot use a public database link when creating a snapshot. A database link with an embedded name and password always establishes connections to the remote database using the specified account. Additionally, the remote account that the link uses must have the SELECT privileges necessary to access the data referenced in the snapshot’s defining query.

Before creating your snapshots, you need to create several administrative database links. Specifically, you should create a PUBLIC database link from the snapshot site to the master site (this makes defining your private database links easier since you don’t need to include the USING clause in each link). You will also need private database links from the snapshot administrator to the proxy administrator and from the propagator to the receiver (if you use the Replication Manager Setup Wizard, these database links will be created for you). See "Security Setup for Snapshot Replication" for more information on snapshot users and database links. Additionally, see Chapter 2 of the Oracle8i Replication API Reference manual.

After the administrative database links have been created, a private database link needs to be created connecting each replicated snapshot schema at the snapshot database to the corresponding schema at the master database. Be sure to embed the associated master database account information in each private database link at the snapshot database. For example, the SALES schema at a snapshot database DB2 should have a private database link DB1 that connects using the SALES username and password.
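For the DB1/DB2 example, the two kinds of link could be sketched as follows. The service name 'db1' and the password placeholder are assumptions; the private link omits the USING clause because it relies on the public link of the same name.

```sql
-- Run once at snapshot database DB2 by a privileged user.
-- 'db1' is a hypothetical service name for the master database.
CREATE PUBLIC DATABASE LINK db1 USING 'db1';

-- Run at DB2 while connected as the SALES schema owner.
-- sales_password is a placeholder for the master-site SALES password.
CREATE DATABASE LINK db1
  CONNECT TO sales IDENTIFIED BY sales_password;
```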

Figure 3-10 Recommended Schema and Database Link Configuration


Privileges

Both the creator and the owner of (schema that contains) the snapshot must be able to issue the defining SELECT statement of the snapshot. If a user other than the replication or snapshot administrator will be creating the snapshot, then that user must have the CREATE SNAPSHOT privilege and the appropriate SELECT privileges to execute the defining SELECT statement. (If you are reviewing the steps in Oracle8i Replication API Reference manual, the necessary privileges are granted as part of the script described in chapter 5.)

Schedule Purge at Master Site

In order to keep the size of the deferred transaction queue in check, you need to schedule a purge operation to remove all successfully completed deferred transactions from the deferred transaction queue. This operation may have already been performed at the master site; re-scheduling the purge operation will not harm the master site, but may change the purge scheduling characteristics.
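A purge could be scheduled at the master site with a call like the one sketched here. The hourly interval is an illustrative choice, not a recommendation.

```sql
BEGIN
  -- Remove successfully completed deferred transactions once an hour.
  DBMS_DEFER_SYS.SCHEDULE_PURGE(
    next_date        => SYSDATE,
    interval         => 'SYSDATE + 1/24',
    delay_seconds    => 0,
    rollback_segment => '');
END;
/
```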

Schedule Push

Often referred to as a scheduled link, scheduling a push at the snapshot site will automatically propagate any deferred transactions at the snapshot site to the associated target master site. Typically, there will only be a single scheduled link per snapshot group at a snapshot site (since a snapshot group only has a single target master site).
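A scheduled link to the master site might be set up as in this sketch. The destination ORC1.WORLD is the master site name used earlier in this chapter, and the hourly interval is an assumption.

```sql
BEGIN
  -- Push queued deferred transactions to the master site once an hour.
  DBMS_DEFER_SYS.SCHEDULE_PUSH(
    destination   => 'ORC1.WORLD',
    interval      => 'SYSDATE + 1/24',
    next_date     => SYSDATE,
    stop_on_error => FALSE,
    delay_seconds => 0,
    parallelism   => 0);
END;
/
```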

SNP Background Processes and Interval

It is important that you have allocated sufficient SNP background processes to handle the automatic refreshing of your snapshots. Since your snapshot site will typically have only a single scheduled link to the target master site, the snapshot site will only require a single SNP process, but to handle additional activity, you may want to allocate at least two SNP processes at the snapshot site.

The SNP processes are defined using the job_queue_processes parameter in the init.ora file for your database. To set your SNP processes, you can either use Instance Manager, a component of Oracle Enterprise Manager, or manually edit the init.ora file.

The SNP job interval determines how often your SNP processes "wake up" to execute any pending operations (such as pushing a queue). While the default value of 60 seconds is adequate for most replicated environments, you may need to adjust this value to maximize performance for your individual requirements. For example, if you want to propagate changes to the target master site every 20 seconds, a job interval of 60 seconds would not be sufficient. On the other hand, if you need to propagate your changes once a day, you may only want your SNP process to check for a pending operation once an hour.

Instance Manager

You will often use Instance Manager to configure the SNP processes and interval at the snapshot site if you have a dedicated network link to the snapshot site or you are able to schedule the network link. This is required because Instance Manager will, in most cases, not be at the snapshot site and thus the configuration will need to be done remotely from the master site (if remote configuration is not possible, see the next section).

Figure 3-11 Use Instance Manager to configure the number of job processes.

Complete the following to set your job processes using Instance Manager (see the Oracle Enterprise Manager Administrator’s Guide for more information on using Instance Manager to configure your database):

You will have the opportunity to save this configuration; this is helpful if you use Instance Manager to manage your database. See the Oracle Enterprise Manager Administrator’s Guide and/or online documentation for more information on using Instance Manager.

Manually Edit INIT.ORA

If you do not have access to Instance Manager, you can manually edit the init.ora file. Use a text editor, such as Notepad, EMACS, or vi (depending on your operating system), to modify the contents of your init.ora file.

In most cases, you will see all of the parameters used in replication grouped together under an Oracle replication heading in your init.ora file.

Figure 3-12 Use Notepad to edit your init.ora file in a Windows environment.

After you have modified the contents of your init.ora file, you will need to restart your database with these new settings (see Oracle8i Administrator’s Guide for information on restarting your database).
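For reference, the relevant entries in init.ora might look like the following fragment. The values shown (two SNP processes and the default 60-second interval) follow the guidance above and are only a starting point.

```
# Oracle replication settings (illustrative values)
job_queue_processes = 2     # number of SNP background processes
job_queue_interval  = 60    # seconds between SNP wake-ups
```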

Create a Snapshot Log

Before creating snapshot groups and snapshots for a remote snapshot site, make sure to create the necessary snapshot logs at the master site. A snapshot log is necessary for every master table that supports at least one snapshot with fast refreshes.

To create a snapshot log at the master site:

You can optionally press the Create Snapshot Log button on the toolbar.

If your snapshot log needs to support both Row ID and Primary Key snapshots, be sure that you enable both the Row ID and the Primary Key checkboxes.

See the following section, «Using Filter Columns» for more information on filter columns.

Press the Help button to see additional information about the available storage settings.

Using Filter Columns

Filter columns are an essential component when using subquery snapshots (see "Data Subsetting with Snapshots" for more information). A filter column must be defined in the snapshot log (see "Create a Snapshot Log") of any master table whose column is referenced in a snapshot’s WHERE clause but is not part of the equijoin (see "Restrictions for Snapshots with Subqueries" for additional information).

Consider the following DDL:
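A statement of the kind discussed here could look like the following sketch. The database link orc1.world is a hypothetical name, and the statement is laid out so that the subquery’s WHERE clause falls on line 5.

```sql
CREATE SNAPSHOT sales.orders REFRESH FAST AS
  SELECT * FROM sales.orders@orc1.world o
  WHERE EXISTS
    (SELECT c_id FROM sales.customers@orc1.world c
     WHERE o.c_id = c.c_id AND zip = 19555);
```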

If you pay close attention to line 5 of the above DDL, you will see that three columns are referenced in the WHERE clause. Columns o.c_id and c.c_id are referenced as part of the equijoin clause; the column zip is an additional filter column. You will therefore need to create a filter column in the snapshot log for the zip column of the sales.customers table.

You are encouraged to analyze the defining queries of your planned snapshot(s) and identify which filter columns will need to be created in your snapshot log(s). If you try to create or refresh a snapshot that requires a filter column before creating the snapshot log containing the filter column, your snapshot creation or refresh may fail.
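For a snapshot that filters on the zip column of sales.customers, the supporting snapshot log could be created at the master site as sketched here. The choice of a primary key log is an assumption that matches the primary key requirement for subquery snapshots.

```sql
-- Snapshot log on the master table that records the primary key of
-- changed rows plus the zip filter column.
CREATE SNAPSHOT LOG ON sales.customers
  WITH PRIMARY KEY (zip);
```

The log needs to exist before the dependent snapshot is created or fast refreshed.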

Create Snapshot Environment

Snapshot environments can be created in several different ways and from several different locations. In most cases, you will want to use deployment templates to locally pre-create a snapshot environment that will be individually deployed to the target snapshot site.

You can also individually create the snapshot environment by establishing a connection to the snapshot site and building the snapshot environment directly.

Replication Manager

See Chapter 4, «Creating Snapshots with Deployment Templates» for information on using deployment templates to centrally create a snapshot environment using Replication Manager.

See Chapter 5, «Directly Create Snapshot Environment» for information on individually creating the snapshot environment with a direct connection to the remote snapshot site using Replication Manager.

Figure 3-13 Creating Snapshot Process


Replication Management API

See Chapter 4 of the Oracle8i Replication API Reference manual for information on using deployment templates to centrally pre-create a snapshot environment using the Replication Management API.

See Chapter 5 of the Oracle8i Replication API Reference manual for information on individually creating the snapshot environment with a direct connection to the remote snapshot site using the Replication Management API.

1 Filter columns are required when the snapshot contains a subquery. See "Data Subsetting with Snapshots" for information on subquery snapshots and "Using Filter Columns" for more information.
