Table of Contents Previous Next


11 Utilities : 11.2 EDB*Loader : 11.2.3 Building the EDB*Loader Control File

[ OPTIONS (param=value [, param=value ] ...) ]
[ INFILE '{ data_file | stdin }' ]
[ BADFILE 'bad_file' ]
[ DISCARDFILE 'discard_file' ]
{ INTO TABLE target_table
[ WHEN field_condition [ AND field_condition ] ...]
(field_def [, field_def ] ...)
where field_def defines a data field, or simply, a field, in data_file that describes the location, data format, or value of the data to be inserted into column_name of target_table. The syntax of field_def is the following:
FILLER [ POSITION (start:end) ] [ fieldtype ] |
[ POSITION (start:end) ] [ fieldtype ]
where fieldtype is one of:
The specification of data_file, bad_file, and discard_file may include the full directory path or a relative directory path to the file name. If the file name is specified alone or with a relative directory path, the file is then assumed to exist (in the case of data_file), or is created (in the case of bad_file or discard_file), relative to the current working directory from which edbldr is invoked.
The directory path and/or file name may be specified by using environment variables. For Linux, the format is $ENV_VARIABLE or ${ENV_VARIABLE} where ENV_VARIABLE is the environment variable set to the directory path and/or file name. For Windows, use %ENV_VARIABLE%.
The operating system account enterprisedb must have read permission on the directory and file specified by data_file.
The operating system account enterprisedb must have write permission on the directories where bad_file and discard_file are to be written.
Note: It is suggested that the file names for data_file, bad_file, and discard_file include extensions of .dat, .bad, and .dsc, respectively. If the provided file name does not contain an extension, EDB*Loader assumes the actual file name includes the appropriate aforementioned extension.
If an EDB*Loader session results in data format errors and the BADFILE clause is not specified, nor is the BAD parameter given on the command line when edbldr is invoked, a bad file is created with the name control_file_base.bad in the current working directory from which edbldr is invoked. control_file_base is the base name of the control file (that is, the file name without any extension) used in the edbldr session.
The DISCARDFILE clause for specifying the discard file is not included in the control file.
The DISCARD parameter for specifying the discard file is not included on the command line.
The DISCARDMAX clause for specifying the maximum number of discarded records is not included in the control file.
The DISCARDS clause for specifying the maximum number of discarded records is not included in the control file.
The DISCARDMAX parameter for specifying the maximum number of discarded records is not included on the command line.
If neither the DISCARDFILE clause nor the DISCARD parameter for explicitly specifying the discard file name are specified, but DISCARDMAX or DISCARDS is specified, then the EDB*Loader session creates a discard file using the data file name with an extension of .dsc.
Note: There is a distinction between keywords DISCARD and DISCARDS. DISCARD is an EDB*Loader command line parameter used to specify the discard file name (see Section 11.2.4). DISCARDS is a clause of the LOAD DATA directive that may only appear in the control file. Keywords DISCARDS and DISCARDMAX provide the same functionality of specifying the maximum number of discarded records allowed before terminating the EDB*Loader session. Records loaded into the database before termination of the EDB*Loader session due to exceeding the DISCARDS or DISCARDMAX settings are kept in the database and are not rolled back.
If one of INSERT, APPEND, REPLACE, or TRUNCATE is specified, it establishes the default action of how rows are to be added to target tables. If omitted, the default action is as if INSERT had been specified.
If the FIELDS TERMINATED BY clause is specified, then the POSITION (start:end) clause may not be specified for any field_def. Alternatively if the FIELDS TERMINATED BY clause is not specified, then every field_def must contain the POSITION (start:end) clause, excluding those with the CONSTANT clause.
OPTIONS param=value
Use the OPTIONS clause to specify param=value pairs that represent an EDB*Loader directive. If a parameter is specified in both the OPTIONS clause and on the command line when edbldr is invoked, the command line setting is used.
If DIRECT is set to TRUE EDB*Loader performs a direct path load instead of a conventional path load. The default value of DIRECT is FALSE.
See Section 11.2.5 for information on direct path loads.
ERRORS=error_count
error_count specifies the number of errors permitted before aborting the EDB*Loader session. The default is 50.
Set PARALLEL to TRUE to indicate that this EDB*Loader session is one of a number of concurrent EDB*Loader sessions participating in a parallel direct path load. The default value of PARALLEL is FALSE.
When PARALLEL is TRUE, the DIRECT parameter must also be set to TRUE . See Section 11.2.6 for more information about parallel direct path loads.
n specifies the number of rows that EDB*Loader will commit before loading the next set of n rows.
If EDB*Loader encounters an invalid row during a load (in which the ROWS parameter is specified), those rows committed prior to encountering the error will remain in the destination table.
SKIP=skip_count
skip_count specifies the number of records at the beginning of the input data file that should be skipped before loading begins. The default is 0.
If SKIP_INDEX_MAINTENANCE is TRUE, index maintenance is not performed as part of a direct path load, and indexes on the loaded table are marked as invalid. The default value of SKIP_INDEX_MAINTENANCE is FALSE.
You can use the REINDEX command to rebuild an index. For more information about the REINDEX command, see the PostgreSQL core documentation, available at:
Use the CHARACTERSET clause to identify the character set encoding of data_file where charset is the character set name. This clause is required if the data file encoding differs from the control file encoding. (The control file encoding must always be in the encoding of the client where edbldr is invoked.)
Examples of charset settings are UTF8, SQL_ASCII, and SJIS.
File containing the data to be loaded into target_table. Each record in the data file corresponds to a row to be inserted into target_table.
Note: If the DATA parameter is specified on the command line when edbldr is invoked, the file given by the command line DATA parameter is used instead.
If the INFILE clause is omitted as well as the command line DATA parameter, then the data file name is assumed to be identical to the control file name, but with an extension of .dat.
Specify stdin (all lowercase letters) if you want to use standard input to pipe the data to be loaded directly to EDB*Loader. This is useful for data sources generating a large number of records to be loaded.
File that receives data_file records that cannot be loaded due to errors.
Note: If the BAD parameter is specified on the command line when edbldr is invoked, the file given by the command line BAD parameter is used instead.
File that receives input data records that are not loaded into any table because none of the selection criteria are met for tables with the WHEN clause, and there are no tables without a WHEN clause. (All records meet the selection criteria of a table without a WHEN clause.)
Note: If the DISCARD parameter is specified on the command line when edbldr is invoked, the file given by the command line DISCARD parameter is used instead.
For example, if max_discard_recs is 0, then the EDB*Loader session is terminated if and when a first discarded record is encountered. If max_discard_recs is 1, then the EDB*Loader session is terminated if and when a second discarded record is encountered.
When the EDB*Loader session is terminated due to exceeding max_discard_recs, prior input data records that have been loaded into the database are retained. They are not rolled back.
Specifies how data is to be loaded into the target tables. If one of INSERT, APPEND, REPLACE, or TRUNCATE is specified, it establishes the default action for all tables, overriding the default of INSERT.
Note: If the table contains rows, the TRUNCATE command must be used to empty the table prior to invoking EDB*Loader. EDB*Loader throws an exception if the DELETE command is used to empty the table instead of the TRUNCATE command. Oracle SQL*Loader allows the table to be emptied by using either the DELETE or TRUNCATE command.
The REPLACE keyword and TRUNCATE keywords are functionally identical. The table is truncated by EDB*Loader prior to loading the new data.
Note: Delete triggers on the table are not fired as a result of the REPLACE operation.
[ ( ] (start:end) { = | != | <> } 'val' [ ) ]
start and end are positive integers specifying the column positions in data_file that mark the beginning and end of a field that is to be compared with the constant val. The first character in each record begins with a start value of 1.
In the WHEN field_condition [ AND field_condition ] clause, if all such conditions evaluate to true for a given record, then EDB*Loader attempts to insert that record into target_table. If the insert operation fails, the record is written to bad_file.
All characters used in the field_condition text (particularly in the val string) must be valid in the database encoding. (For performing data conversion, EDB*Loader first converts the characters in val string to the database encoding and then to the data file encoding.)
If for a given record, none of the WHEN clauses evaluate to true for all INTO TABLE clauses, the record is written to discard_file, if a discard file was specified for the EDB*Loader session.
String of one or more characters that separates each field in data_file. The characters may be single-byte or multi-byte as long as they are valid in the database encoding. Two consecutive appearances of termstring with no intervening character results in the corresponding column set to null.
String of one or more characters used to enclose a field value in data_file. The characters may be single-byte or multi-byte as long as they are valid in the database encoding. Use enclstring on fields where termstring appears as part of the data.
If TRAILING NULLCOLS is specified, then the columns in the column list for which there is no data in data_file for a given record, are set to null when the row is inserted. This applies only to one or more consecutive columns at the end of the column list.
If fields are omitted at the end of a record and TRAILING NULLCOLS is not specified, EDB*Loader assumes the record contains formatting errors and writes it to the bad file.
Name of a column in target_table into which a field value defined by field_def is to be inserted.
The use of the CONSTANT clause completely determines the value to be assigned to a column in each inserted row. No other clause may appear in the same field definition.
If the TERMINATED BY clause is used to delimit the fields in data_file, there must be no delimited field in data_file corresponding to any field definition with a CONSTANT clause. In other words, EDB*Loader assumes there is no field in data_file for any field definition with a CONSTANT clause.
A column name defined with the FILLER clause must not be referenced in a SQL expression. See the discussion of the expr parameter.
POSITION (start:end)
Defines the location of the field in a record in a fixed-width field data file. start and end are positive integers. The first character in the record has a start value of 1.
Note: Specification of any of these field types is optional. All data in data_file must be in human-readable, text form. Specification of a field type is for descriptive purposes only and has no effect on whether or not EDB*Loader successfully inserts the data in the field into the table column. Successful loading depends upon the column data type and its compatibility with the field value.
For example, a column with data type NUMBER(7,2) successfully accepts a field containing 2600, but if the field contains a value such as 26XX, the insertion fails and the record is written to bad_file.
See Section 3.5.7 for date mask formatting information.
Note: If the DATE field type is specified along with a SQL expression for the column, then datemask must be specified after DATE and before the SQL expression. See the following discussion of the expr parameter.
A SQL expression returning a scalar value that is type-compatible with the column data type to which it is assigned in a field definition. Double quotes must enclose expr. expr may contain a reference to any column in the field list (except for fields with the FILLER clause) by prefixing the column name by a colon character (:).
In the preceding control file, the APPEND clause is used to allow the insertion of additional rows into the emp table.
The use of the TRAILING NULLCOLS clause allows the last field supplying the comm column to be omitted from the first and last records. The comm column is set to null for the rows inserted from these records.
The double quotation mark enclosure character surrounds the value JONES, JR. in the last record since the comma delimiter character is part of the field value.
The following example is a control file that loads the same rows into the emp table, but uses a data file containing fixed-width fields:
In the preceding control file, the FIELDS TERMINATED BY and OPTIONALLY ENCLOSED BY clauses are absent. Instead, each field now includes the POSITION clause.
The following control file illustrates the use of the FILLER clause in the data fields for the sal and comm columns. EDB*Loader ignores the values in these fields and sets the corresponding columns to null.
The following example illustrates the use of multiple INTO TABLE clauses. For this example, two empty tables are created with the same data definition as the emp table. The following CREATE TABLE commands create these two empty tables, while inserting no rows from the original emp table:
The following control file contains two INTO TABLE clauses. Also note that there is no APPEND clause so the default operation of INSERT is used, which requires that tables emp_research and emp_sales be empty.
The WHEN clauses specify that when the field designated by columns 47 thru 48 contains 20, the record is inserted into the emp_research table and when that same field contains 30, the record is inserted into the emp_sales table. If neither condition is true, the record is written to the discard file named emp_multitbl.dsc.
The CONSTANT clause is given for column deptno so the specified constant value is inserted into deptno for each record. When the CONSTANT clause is used, it must be the only clause in the field definition other than the column name to which the constant value is assigned.
Finally, column comm of the emp_sales table is assigned a SQL expression. Column names may be referenced in the expression by prefixing the column name with a colon character (:).
Since the records for employees ARNOLDS and JACKSON contain 10 and 40 in columns 47 thru 48, which do not satisfy any of the WHEN clauses, EDB*Loader writes these two records to the discard file, emp_multitbl.dsc, whose content is shown by the following:
The following are the rows loaded into the emp_research and emp_sales tables:

11 Utilities : 11.2 EDB*Loader : 11.2.3 Building the EDB*Loader Control File

Table of Contents Previous Next