External
Tables let you query data in a flat file as though the file were an Oracle
table. In 9i, only read operations were permitted; in 10g, you can also write
out data to an external table, although you can't write to an existing
table.
While
external tables can be queried, they're not usable in many ways regular Oracle
tables are. You cannot perform any DML operations on external tables other than
table creation; one consequence is that you can't create an index on an external
table. External tables are largely used as a convenient way of moving data into
and out of the database.
Oracle
uses SQL*Loader functionality, through the ORACLE_LOADER access driver to move
data from the flat file into the database; it uses a Data Pump access driver to
move data out of the db into a file in an Oracle-proprietary format, and back
into the database from files of that format. While there are some behavior
differences and restrictions, you can think of external tables as a convenient,
SQL-based way to use SQL*Loader and Data Pump functionality.
For
example, suppose that you receive a daily .csv report from another department.
Instead of writing a SQL*Loader script to import each day's .csv file into your
database, you can simply create an external table and write an "insert ...
select" SQL query to insert the data directly into your tables. Place the day's
CSV file in the location specified in the external table definition, run the
query, and you're done.
Since
an external table's data is in the operating system, its data file needs to be
in a place Oracle can access it. So the first step is to create a directory and
grant access to it.
First
create the directory in the operating system, or choose an existing directory.
It must be a real directory. Make sure that the OS user that the Oracle binaries
run as has read-write access to this directory. Note: Be sure not to use a
directory you should be keeping secure, such as an Oracle data file, program,
log or configuration file directory. And if the data you'll be putting there is
sensitive, make sure that other OS users don't have permissions on this
directory
$
cd /oracle/feeds/
$
mkdir xtern
$
mkdir xtern/mySID
$
mkdir xtern/mySID/data
$
ls -l /oracle/feeds/xtern/mySID
total
8
For
Windows Server:
Directly
Create directory on C:\Ext_Table
Put
the external table's data file in the data directory. In this example, I'll use
the following CSV file:
employee_report.csv:
001,Hutt,Jabba,896743856,jabba@thecompany.com,18
002,Simpson,Homer,382947382,homer@thecompany.com,20
003,Kent,Clark,082736194,superman@thecompany.com,5
004,Kid,Billy,928743627,billythkid@thecompany.com,9
005,Stranger,Perfect,389209831,nobody@thecompany.com,23
006,Zoidberg,Dr,094510283,crustacean@thecompany.com,1
You
must actually move or copy the file to the data directory; symlinks won't cut
it. Again, make sure that if the data is sensitive, only the Oracle user can
read or write to it.
The
next step is to create these directories in Oracle, and grant read/write access
on it to the Oracle user who will be creating the external table. When you
create the directory, be sure to use the directory's full path, and don't
include any symlinks in the path -- use the actual full path.
SQL>
connect sys as sysdba
Enter
password:
Connected.
SQL>
create or replace directory xtern_data_dir
2 as '/oracle/feeds/xtern/mySID/data';
Directory
created.
SQL>
grant read,write on directory xtern_data_dir to apps;
Grant
succeeded.
The
last step is to create the table. The CREATE TABLE statement for an external
table has two parts. The first part, like a normal CREATE TABLE, has the table
name and field specs. This is followed by a block of syntax specific to external
tables, which lets you tell Oracle how to interpret the data in the external
file.
SQL>
connect bulkload
Enter
password:
Connected.
SQL>
create table xtern_empl_rpt
2 ( empl_id varchar2(3),
3 last_name varchar2(50),
4 first_name varchar2(50),
5 ssn varchar2(9),
6 email_addr varchar2(100),
7 years_of_service number(2,0)
8 )
9 organization external
10
( default directory xtern_data_dir
11
access parameters
12
( records delimited by newline
13
fields terminated by ','
14
)
15
location ('employee_report.csv')
16
);
Table
created.
At
this point, Oracle hasn't actually tried to load any data. It doesn't attempt to
check the validity of many of the external-table-specific parameters you pass
it. The CREATE TABLE statement will succeed even if the external data file you
specify doesn't actually exist.
With
the create table statement, you've created table metadata in the data dictionary
and instructed Oracle how to direct the ORACLE_LOADER access driver to parse the
data in the datafile. Now, kick off the load by accessing the table:
SQL>
select * from xtern_empl_rpt ;
EMP
LAST_NAME FIRST_NAME SSN EMAIL_ADDR
YEARS_OF_SERVICE
---
---------- ---------- --------- ------------------------------
----------------
001
Hutt Jabba 896743856
jabba@thecompany.com 18
002
Simpson Homer 382947382 homer@thecompany.com
20
003
Kent Clark 082736194 superman@thecompany.com
5
004
Kid Billy 928743627 billythkid@thecompany.com
9
005
Stranger Perfect 389209831 nobody@thecompany.com
23
006
Zoidberg Dr 094510283 crustacean@thecompany.com
1
6
rows selected.
Oracle
used the ORACLE_LOADER driver to process the file, and just as with SQL*Loader,
it's created a log file that you can inspect to see what just happened. The log
file -- and the "bad" and "discard" files -- will have been written to the
directory you specified as the "default directory" in your CREATE TABLE
statement, and the file names default to tablename_ospid :
$
ls -l
total
16
-rw-r--r--
1 oracle oinstall 3652 Mar 1 19:41 XTERN_EMPL_RPT_26797.log
-rw-------
1 oracle oinstall 313 Mar 1 18:34 employee_report.csv
If
Oracle was unable to process the data given the access parameters you specified,
you'll get an error on the command line and in the log file, and there will also
be a bad and/or discard file. (Note: if you're copying and pasting data into
your external data file, be sure not to put a newline after the last record, or
SQL*Loader will expect a seventh record, and you'll get an error when you try to
select from the external table.)
You
may want to configure separate directories for the SQL*Loader output files --
the LOG file, the DISCARD file and the BAD file -- as well as for the external
table data. You can lump all four in the same directory, as we did in the
previous example, although it's a bad idea: a naming mishap could have you
overwriting one external table's data file with another's bad file. I like to
have one directory for data files, and one for log/bad/discard
files:
$
cd xtern/mySID
$
mkdir log
$
ls -l
total
16
drwx------
2 oracle oinstall 4096 Mar 1 17:33 data
drwx------
2 oracle oinstall 4096 Mar 1 17:32 log
Again,
these must be actual directories, not symlinks, and be sure to set the
permissions appropriately. To eliminate the possibility of any naming mishap,
you can grant READ access only on /.../data, and WRITE access only on /..../log,
to the user creating the external tables.
You
can use ALTER TABLE to change the access parameters without dropping and
redefining the whole table:
SQL>
alter table xtern_empl_rpt
2 access parameters
3 ( records delimited by newline
4 badfile xtern_log_dir:'xtern_empl_rpt.bad'
5 logfile xtern_log_dir:'xtern_empl_rpt.log'
6 discardfile xtern_log_dir:'xtern_empl_rpt.dsc'
7 fields terminated by ','
8 ) ;
Table
altered.
Alternatively,
you can set up the table so that no log, discard or bad files are generated.
SELECTing data from the table will still fail if the maximum number of rejects
is exceeded, just as in SQL*Loader. You can change the reject limit for an
external table with an ALTER TABLE statement:
SQL>
ALTER TABLE XTERN_EMPL_RPT SET REJECT_LIMIT 100;
Where
external tables really shine are in the ease with which you can load their data
into your tables. A particularly nice feature is that you can use any valid
function that the current Oracle user has rights on to transform the raw data
before loading it into your database tables. For example, suppose you had a
function, get_bday_from_ssn (ssn in varchar2) that looked up an employee's birth
date given their SSN. You can use that function to populate a BIRTH_DATE column
in your local database table in the same step as you load the data into
it.
SQL>
create table empl_info as
2 (select empl_id, last_name, first_name, ssn, get_bday_from_ssn (ssn)
birth_dt
3* from xtern_empl_rpt)
SQL>
/
Table
created.
SQL>
select * from empl_info ;
EMP
LAST_NAME FIRST_NAME SSN BIRTH_DT
---
---------- ---------- --------- ----------
001
Hutt Jabba 896743856 03/11/1939
002
Simpson Homer 382947382 11/01/1967
003
Kent Clark 082736194 01/15/1925
004
Kid Billy 928743627 07/20/1954
005
Stranger Perfect 389209831 10/23/1980
006
Zoidberg Dr 094510283 04/04/2989
6
rows selected.
Unloading
data into an external file...
Oracle
10g lets you create a new external table from data in your database, which goes
into a flat file pushed from the database using the ORACLE_DATAPUMP access
driver. This flat file is in an Oracle-proprietary format that can be read by
DataPump. The syntax is similar to the CREATE TABLE... ORGANIZATION EXTERNAL
above, but simpler -- since you can't specify the data format, you can specify
very few access_parameters. The key difference is that you must specify the
access driver, ORACLE_DATAPUMP, since the access driver defaults to
ORACLE_LOADER.
SQL>
create table export_empl_info
2 organization external
3 ( type oracle_datapump
4 default directory xtern_data_dir
5 location ('empl_info_rpt.dmp')
6* ) as select * from empl_info
SQL>
/
Table
created.
SQL>
select * from export_empl_info ;
EMPL_ID
LAST_NAME FIRST_NAME SSN BIRTH_DT
-------
--------------- --------------- --------- ----------
001
Hutt Jabba 896743856 01/01/1979
002
Simpson Homer 382947382 01/01/1979
003
Kent Clark 082736194 01/01/1979
004
Kid Billy 928743627 01/01/1979
005
Stranger Perfect 389209831 01/01/1979
006
Zoidberg Dr 094510283 01/01/1979
6
rows selected.
...
and back in again
You
can now move the file you just created, empl_info_rpt.dmp, to another system and
create an external table to read the data:
SQL>
connect apps/xxxx@TEST
Connected.
SQL>
create table import_empl_info
2 ( empl_id varchar2(3),
3 last_name varchar2(50),
4 first_name varchar2(50),
5 ssn varchar2(9),
6 birth_dt date
7 )
8 organization external
9 ( type oracle_datapump
10
default directory xtern_data_dir
11
location ('empl_info_rpt.dmp')
12
) ;
Table
created.
SQL>
select * from import_empl_info ;
EMPL_ID
LAST_NAME FIRST_NAME SSN BIRTH_DT
-------
--------------- --------------- --------- ----------
001
Hutt Jabba 896743856 01/01/1979
002
Simpson Homer 382947382 01/01/1979
003
Kent Clark 082736194 01/01/1979
004
Kid Billy 928743627 01/01/1979
005
Stranger Perfect 389209831 01/01/1979
006
Zoidberg Dr 094510283 01/01/1979
6
rows selected.
No comments:
Post a Comment