One of the common challenges in operational and testing environments is extracting a portion of database data for analysis, migration, or testing. Oracle Data Pump (expdp
) provides a feature called SAMPLE
, which helps extract a subset of data from one or multiple tables without copying the entire dataset.
In this article, we will explore the correct use of SAMPLE
in expdp
in two different scenarios and address issues related to log tables, foreign keys, and constraints.
What is SAMPLE
and How Does It Work?
The SAMPLE
parameter in expdp
enables random extraction of a percentage of data from each table. This method is efficient, but there are important considerations:
SAMPLE
works at the data block level, not at the row level.- It does not guarantee that exactly the specified percentage of data will be retrieved.
- It applies to all tables by default but can be limited to specific tables.
- It works with
BLOB
andCLOB
data types, but some values may not be retained.
To better understand its functionality, we will create a large table in the HR
schema, simulating log tables in operational schemas. Typically, schemas contain both core tables and log tables, where log tables occupy most of the database storage.
Creating a Large Table for Testing
The following commands create a large table in the HR
schema, simulating a log table with a high volume of data:
sys@orcl(41)> CREATE TABLE hr.log_table AS SELECT * FROM dba_extents;
Table created.
sys@orcl(41)> INSERT INTO hr.log_table SELECT * FROM hr.log_table;
7888 rows created.
sys@orcl(41)> /
15776 rows created.
sys@orcl(41)> /
31552 rows created.
sys@orcl(41)> /
63104 rows created.
sys@orcl(41)> /
126208 rows created.
sys@orcl(41)> /
252416 rows created.
sys@orcl(42)> SELECT COUNT(*) FROM "HR"."LOG_TABLE";
COUNT(*)
----------
1009664
Now, let's explore two different scenarios where SAMPLE
can be useful.
Scenario 1: Exporting 10% of All Tables in HR
Schema
The following expdp
command exports 10% of all tables in the HR
schema:
expdp directory=oracle_base dumpfile=sample_10_percent.dmp logfile=sample_10_percent.log schemas=hr sample=10
Export Process Output:
. . exported "HR"."LOG_TABLE" 5.908 MB 101087 rows
. . exported "HR"."EMPLOYEES" 10.17 KB 9 rows
. . exported "HR"."LOCATIONS" 7.320 KB 2 rows
. . exported "HR"."JOB_HISTORY" 6.843 KB 0 rows
. . exported "HR"."JOBS" 6.445 KB 1 rows
. . exported "HR"."DEPARTMENTS" 6.523 KB 2 rows
. . exported "HR"."COUNTRIES" 5.937 KB 2 rows
. . exported "HR"."REGIONS" 5.476 KB 0 rows
Result: Only about 10% of the data from each table is exported.
Handling Foreign Key Dependencies During Import (impdp
)
During the import process, foreign key constraints may fail due to missing parent keys.
To demonstrate, we rename the schema to hrtest2
before import:
impdp directory=oracle_base dumpfile=sample_10_percent.dmp logfile=imp_sample_10_percent.log remap_schema=hr:hrtest2
Import Process Output with Errors:
ORA-02298: cannot validate (HRTEST2.DEPT_LOC_FK) - parent keys not found
ORA-02298: cannot validate (HRTEST2.EMP_MANAGER_FK) - parent keys not found
ORA-02298: cannot validate (HRTEST2.EMP_JOB_FK) - parent keys not found
Problem: Many constraints fail because SAMPLE=10
only exports a portion of each table, which can break parent-child relationships in foreign keys.
✅ Solution: Disable Constraints During Import
To avoid these issues, exclude constraints during import:
impdp directory=oracle_base dumpfile=sample_10_percent.dmp logfile=imp_sample_10_percent.log remap_schema=hr:hrtest2 exclude=CONSTRAINT
Note: This solution works, but it may not be ideal for production-like environments that require constraints to be enforced.
Scenario 2: Exporting 10% Data from Only Log Tables
A better approach is to export only log tables with SAMPLE
, while keeping core tables intact.
expdp directory=oracle_base dumpfile=sample_one_table.dmp logfile=sample_one_table.log schemas=hr sample=HR.LOG_TABLE:10
Export Process Output:
. . exported "HR"."LOG_TABLE" 5.885 MB 100707 rows
. . exported "HR"."EMPLOYEES" 17.08 KB 107 rows
. . exported "HR"."LOCATIONS" 8.437 KB 23 rows
. . exported "HR"."JOB_HISTORY" 7.195 KB 10 rows
. . exported "HR"."JOBS" 7.109 KB 19 rows
. . exported "HR"."DEPARTMENTS" 7.125 KB 27 rows
. . exported "HR"."COUNTRIES" 6.367 KB 25 rows
. . exported "HR"."REGIONS" 5.546 KB 4 rows
Result:
- The log table (
LOG_TABLE
) is exported with only 10% of its data. - All other core tables are exported completely.
- This ensures that foreign keys remain intact while keeping log tables small.
Importing Data Without Issues
impdp directory=oracle_base dumpfile=sample_one_table.dmp logfile=imp_sample_one_table.log remap_schema=hr:hrtest
Success! The import completes without any constraint issues.